(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5116 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
ACAGCGTTCT CTTAATACTA GTACAAACCC ACAATAAAAT ATGACAAACA ACAATTACAA 60
CACCTTTTTT GCAGTCTATA TGCAAATATT TTAAAAAATA GTATAAATCC GCCATATAAA 120
ATGGTATAAT CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC ATCTTTCATC 180
TTTCATCTTT CATCTTTCAT CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC 240
ACATGCCCTG ATGAACCGAG GGAAGGGAGG GAGGGGCAAG AATGAAGAGG GAGCTGAACG 300
AACGCAAATG ATAAAGTAAT TTAATTGTTC AACTAACCTT AGGAGAAAAT ATGAACAAGC 360
TATATCGTCT CAAATTCAGC AAACGCCTGA ATGCTTTGGT TGCTGTGTCT GAATTGGCAC 420
GGGGTTGTGA CCATTCCACA GAAAAAGGCA GCGAAAAACC TGCTCGCATG AAAGTGCGTC 480
ACTTAGCGTT AAAGCCACTT TCCGCTATGT TACTATCTTT AGGTGTAACA TCTATTCCAC 540
AATCTGTTTT AGCAAGCGGC TTACAAGGAA TGGATGTAGT ACACGGCACA GCCACTATGC 600
AAGTAGATGG TAATAAAACC ATTATCCGCA ACAGTGTTGA CGATATCATT AATTGGAAAC 660
AATTTAACAT CGACCAAAAT GAAATGGTGC AGTTTTTACA AGAAAACAAC AACTCCGCCG 720
TATTCAACCG TGTTACATCT AACCAAATCT CCCAATTAAA AGGGATTTTA GATTCTAACG 780
GACAAGTCTT TTTAATCAAC CCAAATGGTA TCACAATAGG TAAAGACGCA ATTATTAACA 840
CTAATGGCTT TACGGCTTCT ACGCTAGACA TTTCTAACGA AAACATCAAG GCGCGTAATT 900
TCACCTTCGA GCAAACCAAA GATAAAGCGC TCGCTGAAAT TGTGAATCAC GGTTTAATTA 960
CTGTCGGTAA AGACGGCAGT GTAAATCTTA TTGGTGGCAA AGTGAAAAAC GAGGGTGTGA 1020
TTAGCGTAAA TGGTGGCAGC ATTTCTTTAC TCGCAGGGCA AAAAATCACC ATCAGCGATA 1080
TAATAAACCC AACCATTACT TACAGCATTG CCGCGCCTGA AAATGAAGCG GTCAATCTGG 1140
GCGATATTTT TGCCAAAGGC GGTAACATTA ATGTCCGTGC TGCCACTATT CGAAACCAAG 1200
GTAAACTTTC TGCTGATTCT GTAAGCAAAG ATAAAAGCGG CAATATTGTT CTTTCCGCCA 1260
AAGAGGGTGA AGCGGAAATT GGCGGTGTAA TTTCCGCTCA AAATCAGCAA GCTAAAGGCG 1320
GCAAGCTGAT GATTACAGGC GATAAAGTCA CATTAAAAAC AGGTGCAGTT ATCGACCTTT 1380
CAGGTAAAGA AGGGGGAGAA ACTTACCTTG GCGGTGACGA GCGCGGCGAA GGTAAAAAGG 1440
GCATTCAATT AGCAAAGAAA ACCTCTTTAG AAAAAGGCTC AACCATCAAT GTATCAGGCA 1500
AAGAAAAAGG CGGACGCGCT ATTGTGTGGG GCGATATTGC GTTAATTGAC GGCAATATTA 1560
ACGCTCAAGG TAGTGGTGAT ATCGCTAAAA CCGGTGGTTT TGTGGAGACG TCGGGGCATG 1620
ATTTATTCAT CAAAGACAAT GCAATTGTTG ACGCCAAAGA GTGGTTGTTA GACCCGGATA 1680
ATGTATCTAT TAATGCAGAA ACAGCAGGAC GCAGCAATAC TTCAGAAGAC GATGAATACA 1740
CGGGATCCGG GAATAGTGCC AGCACCCCAA AACGAAACAA AGAAAAGACA ACATTAACAA 1800
ACACAACTCT TGAGAGTATA CTAAAAAAAG GTACCTTTGT TAACATCACT GCTAATCAAC 1860
GCATCTATGT CAATAGCTCC ATTAATTTAT CCAATGGCAG CTTAACTCTT TGGAGTGAGG 1920
GTCGGAGCGG TGGCGGCGTT GAGATTAACA ACGATATTAC CACCGGTGAT GATACCAGAG 1980
GTGCAAACTT AACAATTTAC TCAGGCGGCT GGGTTGATGT TCATAAAAAT ATCTCACTCG 2040
GGGCGCAAGG TAACATAAAC ATTACAGCTA AACAAGATAT CGCCTTTGAG AAAGGAAGCA 2100
ACCAAGTCAT TACAGGTCAA GGGACTATTA CCTCAGGCAA TCAAAAAGGT TTTAGATTTA 2160
ATAATGTCTC TCTAAACGGC ACTGGCAGCG GACTGCAATT CACCACTAAA AGAACCAATA 2220
AATACGCTAT CACAAATAAA TTTGAAGGGA CTTTAAATAT TTCAGGGAAA GTGAACATCT 2280
CAATGGTTTT ACCTAAAAAT GAAAGTGGAT ATGATAAATT CAAAGGACGC ACTTACTGGA 2340
ATTTAACCTC CTTAAATGTT TCCGAGAGTG GCGAGTTTAA CCTCACTATT GACTCCAGAG 2400
GAAGCGATAG TGCAGGCACA CTTACCCAGC CTTATAATTT AAACGGTATA TCATTCAACA 2460
AAGACACTAC CTTTAATGTT GAACGAAATG CAAGAGTCAA CTTTGACATC AAGGCACCAA 2520
TAGGGATAAA TAAGTATTCT AGTTTGAATT ACGCATCATT TAATGGAAAC ATTTCAGTTT 2580
CGGGAGGGGG GAGTGTTGAT TTCACACTTC TCGCCTCATC CTCTAACGTC CAAACCCCCG 2640
GTGTAGTTAT AAATTCTAAA TACTTTAATG TTTCAACAGG GTCAAGTTTA AGATTTAAAA 2700
CTTCAGGCTC AACAAAAACT GGCTTCTCAA TAGAGAAAGA TTTAACTTTA AATGCCACCG 2760
GAGGCAACAT AACACTTTTG CAAGTTGAAG GCACCGATGG AATGATTGGT AAAGGCATTG 2820
TAGCCAAAAA AAACATAACC TTTGAAGGAG GTAACATCAC CTTTGGCTCC AGGAAAGCCG 2880
TAACAGAAAT CGAAGGCAAT GTTACTATCA ATAACAACGC TAACGTCACT CTTATCGGTT 2940
CGGATTTTGA CAACCATCAA AAACCTTTAA CTATTAAAAA AGATGTCATC ATTAATAGCG 3000
GCAACCTTAC CGCTGGAGGC AATATTGTCA ATATAGCCGG AAATCTTACC GTTGAAAGTA 3060
ACGCTAATTT CAAAGCTATC ACAAATTTCA CTTTTAATGT AGGCGGCTTG TTTGACAACA 3120
AAGGCAATTC AAATATTTCC ATTGCCAAAG GAGGGGCTCG CTTTAAAGAC ATTGATAATT 3180
CCAAGAATTT AAGCATCACC ACCAACTCCA GCTCCACTTA CCGCACTATT ATAAGCGGCA 3240
ATATAACCAA TAAAAACGGT GATTTAAATA TTACGAACGA AGGTAGTGAT ACTGAAATGC 3300
AAATTGGCGG CGATGTCTCG CAAAAAGAAG GTAATCTCAC GATTTCTTCT GACAAAATCA 3360
ATATTACCAA ACAGATAACA ATCAAGGCAG GTGTTGATGG GGAGAATTCC GATTCAGACG 3420
CGACAAACAA TGCCAATCTA ACCATTAAAA CCAAAGAATT GAAATTAACG CAAGACCTAA 3480
ATATTTCAGG TTTCAATAAA GCAGAGATTA CAGCTAAAGA TGGTAGTGAT TTAACTATTG 3540
GTAACACCAA TAGTGCTGAT GGTACTAATG CCAAAAAAGT AACCTTTAAC CAGGTTAAAG 3600
ATTCAAAAAT CTCTGCTGAC GGTCACAAGG TGACACTACA CAGCAAAGTG GAAACATCCG 3660
GTAGTAATAA CAACACTGAA GATAGCAGTG ACAATAATGC CGGCTTAACT ATCGATGCAA 3720
AAAATGTAAC AGTAAACAAC AATATTACTT CTCACAAAGC AGTGAGCATC TCTGCGACAA 3780
GTGGAGAAAT TACCACTAAA ACAGGTACAA CCATTAACGC AACCACTGGT AACGTGGAGA 3840
TAACCGCTCA AACAGGTAGT ATCCTAGGTG GAATTGAGTC CAGCTCTGGC TCTGTAACAC 3900
TTACTGCAAC CGAGGGCGCT CTTGCTGTAA GCAATATTTC GGGCAACACC GTTACTGTTA 3960
CTGCAAATAG CGGTGCATTA ACCACTTTGG CAGGCTCTAC AATTAAAGGA ACCGAGAGTG 4020
TAACCACTTC AAGTCAATCA GGCGATATCG GCGGTACGAT TTCTGGTGGC ACAGTAGAGG 4080
TTAAAGCAAC CGAAAGTTTA ACCACTCAAT CCAATTCAAA AATTAAAGCA ACAACAGGCG 4140
AGGCTAACGT AACAAGTGCA ACAGGTACAA TTGGTGGTAC GATTTCCGGT AATACGGTAA 4200
ATGTTACGGC AAACGCTGGC GATTTAACAG TTGGGAATGG CGCAGAAATT AATGCGACAG 4260
AAGGAGCTGC AACCTTAACT ACATCATCGG GCAAATTAAC TACCGAAGCT AGTTCACACA 4320
TTACTTCAGC CAAGGGTCAG GTAAATCTTT CAGCTCAGGA TGGTAGCGTT GCAGGAAGTA 4380
TTAATGCCGC CAATGTGACA CTAAATACTA CAGGCACTTT AACTACCGTG AAGGGTTCAA 4440
ACATTAATGC AACCAGCGGT ACCTTGGTTA TTAACGCAAA AGACGCTGAG CTAAATGGCG 4500
CAGCATTGGG TAACCACACA GTGGTAAATG CAACCAACGC AAATGGCTCC GGCAGCGTAA 4560
TCGCGACAAC CTCAAGCAGA GTGAACATCA CTGGGGATTT AATCACAATA AATGGATTAA 4620
ATATCATTTC AAAAAACGGT ATAAACACCG TACTGTTAAA AGGCGTTAAA ATTGATGTGA 4680
AATACATTCA ACCGGGTATA GCAAGCGTAG ATGAAGTAAT TGAAGCGAAA CGCATCCTTG 4740
AGAAGGTAAA AGATTTATCT GATGAAGAAA GAGAAGCGTT AGCTAAACTT GGAGTAAGTG 4800
CTGTACGTTT TATTGAGCCA AATAATACAA TTACAGTCGA TACACAAAAT GAATTTGCAA 4860
CCAGACCATT AAGTCGAATA GTGATTTCTG AAGGCAGGGC GTGTTTCTCA AACAGTGATG 4920
GCGCGACGGT GTGCGTTAAT ATCGCTGATA ACGGGCGGTA GCGGTCAGTA ATTGACAAGG 4980
TAGATTTCAT CCTGCAATGA AGTCATTTTA TTTTCGTATT ATTTACTGTG TGGGTTAAAG 5040
TTCAGTACGG GCTTTACCCA TCTTGTAAAA AATTACGGAG AATACAATAA AGTATTTTTA 5100
ACAGGTTATT ATTATG
(2) INFORMATION FOR SEQ ID NO:2:











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1536 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE : DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Ala Arg Gly Cys Asp His Ser Thr Glu Lys
20 25 30

Gly Ser Glu Lys Pro Ala Arg Met Lys Val Arg His Leu Ala Leu Lys
35 40 45

Pro Leu Ser Ala Met Leu Leu Ser Leu Gly Val Thr Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Asp Val Val His Gly Thr
65 70 75 80

Ala Thr Met Gln Val Asp Gly Asn Lys Thr Ile Ile Arg Asn Ser Val
85 90 95

Asp Ala Ile Ile Asn Trp Lys Gln Phe Asn Ile Asp Gln Asn Glu Met
100 105 110

Val Gln Phe Leu Gln Glu Asn Asn Asn Ser Ala Val Phe Asn Arg Val
115 120 125
Thr Ser Asn Gln Ile Ser Gln Leu Lys Gly Ile Leu Asp Ser Asn Gly
130 135 140

Gln Val Phe Leu Ile Asn Pro Asn Gly Ile Thr Ile Gly Lys Asp Ala
145 150 155 160

Ile Ile Asn Thr Asn Gly Phe Thr Ala Ser Thr Leu Asp Ile Ser Asn
165 170 175

Glu Asn Ile Lys Ala Arg Asn Phe Thr Phe Glu Gln Thr Lys Asp Lys
180 185 190

Ala Leu Ala Glu Ile Val Asn His Gly Leu Ile Thr Val Gly Lys Asp
195 200 205

Gly Ser Val Asn Leu Ile Gly Gly Lys Val Lys Asn Glu Gly Val Ile
210 215 220

Ser Val Asn Gly Gly Ser Ile Ser Leu Leu Ala Gly Gln Lys Ile Thr
225 230 235 240

Ile Ser Asp Ile Ile Asn Pro Thr Ile Thr Tyr Ser Ile Ala Ala Pro
245 250 255

Glu Asn Glu Ala Val Asn Leu Gly Asp Ile Phe Ala Lys Gly Gly Asn
260 265 270

Ile Asn Val Arg Ala Ala Thr Ile Arg Asn Gln Gly Lys Leu Ser Ala
275 280 285

Asp Ser Val Ser Lys Asp Lys Ser Gly Asn Ile Val Leu Ser Ala Lys
290 295 300

Glu Gly Glu Ala Glu Ile Gly Gly Val Ile Ser Ala Gln Asn Gln Gln
305 310 315 320

Ala Lys Gly Gly Lys Leu Met Ile Thr Gly Asp Lys Val Thr Leu Lys
325 330 335

Thr Gly Ala Val Ile Asp Leu Ser Gly Lys Glu Gly Gly Glu Thr Tyr
340 345 350

Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala
355 360 365

Lys Lys Thr Ser Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys
370 375 380

Glu Lys Gly Gly Arg Ala Ile Val Trp Gly Asp Ile Ala Leu Ile Asp
385 390 395 400

Gly Asn Ile Asn Ala Gln Gly Ser Gly Asp Ile Ala Lys Thr Gly Gly
405 410 415

Phe Val Glu Thr Ser Gly His Asp Leu Phe Ile Lys Asp Asn Ala Ile
420 425 430

Val Asp Ala Lys Glu Trp Leu Leu Asp Phe Asp Asn Val Ser Ile Asn
435 440 445

Ala Glu Thr Ala Gly Arg Ser Asn Thr Ser Glu Asp Asp Glu Tyr Thr
450 455 460

Gly Ser Gly Asn Ser Ala Ser Thr Pro Lys Arg Asn Lys Glu Lys Thr
465 470 475 480
Thr Leu Thr Asn Thr Thr Leu Glu Ser Ile Leu Lys Lys Gly Thr Phe
485 490 495

Val Asn Ile Thr Ala Asn Gln Arg Ile Tyr Val Asn Ser Ser Ile Asn
500 505 510

Leu Ser Asn Gly Ser Leu Thr Leu Trp Ser Glu Gly Arg Ser Gly Gly
515 520 525

Gly Val Glu Ile Asn Asn Asp Ile Thr Thr Gly Asp Asp Thr Arg Gly
530 535 540

Ala Asn Leu Thr Ile Tyr Ser Gly Gly Trp Val Asp Val His Lys Asn
545 550 555 560

Ile Ser Leu Gly Ala Gln Gly Asn Ile Asn Ile Thr Ala Lys Gln Asp
565 570 575

Ile Ala Phe Glu Lys Gly Ser Asn Gln Val Ile Thr Gly Gln Gly Thr
580 585 590

Ile Thr Ser Gly Asn Gln Lys Gly Phe Arg Phe Asn Asn Val Ser Leu
595 600 605

Asn Gly Thr Gly Ser Gly Leu Gln Phe Thr Thr Lys Arg Thr Asn Lys
610 615 620

Tyr Ala Ile Thr Asn Lys Phe Glu Gly Thr Leu Asn Ile Ser Gly Lys
625 630 635 640

Val Asn Ile Ser Met Val Leu Pro Lys Asn Glu Ser Gly Tyr Asp Lys
645 650 655

Phe Lys Gly Arg Thr Tyr Trp Asn Leu Thr Ser Leu Asn Val Ser Glu
660 665 670

Ser Gly Glu Phe Asn Leu Thr Ile Asp Ser Arg Gly Ser Asp Ser Ala
675 680 685

Gly Thr Leu Thr Gln Pro Tyr Asn Leu Asn Gly Ile Ser Phe Asn Lys
690 695 700

Asp Thr Thr Phe Asn Val Glu Arg Asn Ala Arg Val Asn Phe Asp Ile
705 710 715 720

Lys Ala Pro Ile Gly Ile Asn Lys Tyr Ser Ser Leu Asn Tyr Ala Ser
725 730 735

Phe Asn Gly Asn Ile Ser Val Ser Gly Gly Gly Ser Val Asp Phe Thr
740 745 750

Leu Leu Ala Ser Ser Ser Asn Val Gln Thr Pro Gly Val Val Ile Asn
755 760 765

Ser Lys Tyr Phe Asn Val Ser Thr Gly Ser Ser Leu Arg Phe Lys Thr
770 775 780

Ser Gly Ser Thr Lys Thr Gly Phe Ser Ile Glu Lys Asp Leu Thr Leu
785 790 795 800

Asn Ala Thr Gly Gly Asn Ile Thr Leu Leu Gln Val Glu Gly Thr Asp
805 810 815

Gly Met Ile Gly Lys Gly Ile Val Ala Lys Lys Asn Ile Thr Phe Glu
820 825 830
Gly Gly Asn Ile Thr Phe Gly Ser Arg Lys Ala Val Thr Glu Ile Glu
835 840 845

Gly Asn Val Thr Ile Asn Asn Asn Ala Asn Val Thr Leu Ile Gly Ser
850 855 860

Asp Phe Asp Asn His Gln Lys Pro Leu Thr Ile Lys Lys Asn Val Ile
865 870 875 880

Ile Asn Ser Gly Asn Leu Thr Ala Gly Gly Asn Ile Val Asn Ile Ala
885 890 895

Gly Asn Leu Thr Val Glu Ser Asn Ala Asn Phe Lys Ala Ile Thr Asn
900 905 910

Phe Thr Phe Asn Val Gly Gly Leu Phe Asp Asn Lys Gly Asn Ser Asn
915 920 925

Ile Ser Ile Ala Lys Gly Gly Ala Arg Phe Lys Asp Ile Asp Asn Ser
930 935 940

Lys Asn Leu Ser Ile Thr Thr Asn Ser Ser Ser Thr Tyr Arg Thr Ile
945 950 955 960

Ile Ser Gly Asn Ile Thr Asn Lys Asn Gly Asp Leu Asn Ile Thr Asn
965 970 975

Glu Gly Ser Asp Thr Glu Met Gln Ile Gly Gly Asp Val Ser Gln Lys
980 985 990

Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Ile Asn Ile Thr Lys Gln
995 1000 1005

Ile Thr Ile Lys Ala Gly Val Asp Gly Glu Asn Ser Asp Ser Asp Ala
1010 1015 1020

Thr Asn Asn Ala Asn Leu Thr Ile Lys Thr Lys Glu Leu Lys Leu Thr
1025 1030 1035 1040

Gln Asp Leu Asn Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr Ala Lys
1045 1050 1055

Asp Gly Ser Asp Leu Thr Ile Gly Asn Thr Asn Ser Ala Asp Gly Thr
1060 1065 1070

Asn Ala Lys Lys Val Thr Phe Asn Gln Val Lys Asp Ser Lys Ile Ser
1075 1080 1085

Ala Asp Gly His Lys Val Thr Leu His Ser Lys Val Glu Thr Ser Gly
1090 1095 1100

Ser Asn Asn Asn Thr Glu Asp Ser Ser Asp Asn Asn Ala Gly Leu Thr
1105 1110 1115 1120

Ile Asp Ala Lys Asn Val Thr Val Asn Asn Asn Ile Thr Ser His Lys
1125 1130 1135

Ala Val Ser Ile Ser Ala Thr Ser Gly Glu Ile Thr Thr Lys Thr Gly
1140 1145 1150

Thr Thr Ile Asn Ala Thr Thr Gly Asn Val Glu Ile Thr Ala Gln Thr
1155 1160 1165

Gly Ser Ile Leu Gly Gly Ile Glu Ser Ser Ser Gly Ser Val Thr Leu
1170 1175 1180
Thr Ala Thr Glu Gly Ala Leu Ala Val Ser Asn Ile Ser Gly Asn Thr
1185 1190 1195 1200

Val Thr Val Thr Ala Asn Ser Gly Ala Leu Thr Thr Leu Ala Gly Ser
1205 1210 1215

Thr Ile Lys Gly Thr Glu Ser Val Thr Thr Ser Ser Gln Ser Gly Asp
1220 1225 1230

Ile Gly Gly Thr Ile Ser Gly Gly Thr Val Glu Val Lys Ala Thr Glu
1235 1240 1245

Ser Leu Thr Thr Gln Ser Asn Ser Lys Ile Lys Ala Thr Thr Gly Glu
1250 1255 1260

Ala Asn Val Thr Ser Ala Thr Gly Thr Ile Gly Gly Thr Ile Ser Gly
1265 1270 1275 1280

Asn Thr Val Asn Val Thr Ala Asn Ala Gly Asp Leu Thr Val Gly Asn
1285 1290 1295

Gly Ala Glu Ile Asn Ala Thr Glu Gly Ala Ala Thr Leu Thr Thr Ser
1300 1305 1310

Ser Gly Lys Leu Thr Thr Glu Ala Ser Ser His Ile Thr Ser Ala Lys
1315 1320 1325

Gly Gln Val Asn Leu Ser Ala Gln Asp Gly Ser Val Ala Gly Ser Ile
1330 1335 1340

Asn Ala Ala Asn Val Thr Leu Asn Thr Thr Gly Thr Leu Thr Thr Val
1345 1350 1355 1360

Lys Gly Ser Asn Ile Asn Ala Thr Ser Gly Thr Leu Val Ile Asn Ala
1365 1370 1375

Lys Asp Ala Glu Leu Asn Gly Ala Ala Leu Gly Asn His Thr Val Val
1380 1385 1390

Asn Ala Thr Asn Ala Asn Gly Ser Gly Ser Val Ile Ala Thr Thr Ser
1395 1400 1405

Ser Arg Val Asn Ile Thr Gly Asp Leu Ile Thr Ile Asn Gly Leu Asn
1410 1415 1420

Ile Ile Ser Lys Asn Gly Ile Asn Thr Val Leu Leu Lys Gly Val Lys
1425 1430 1435 1440

Ile Asp Val Lys Tyr Ile Gln Pro Gly Ile Ala Ser Val Asp Glu Val
1445 1450 1455

Ile Glu Ala Lys Arg Ile Leu Glu Lys Val Lys Asp Leu Ser Asp Glu
1460 1465 1470

Glu Arg Glu Ala Leu Ala Lys Leu Gly Val Ser Ala Val Arg Phe Ile
1475 1480 1485

Glu Pro Asn Asn Thr Ile Thr Val Asp Thr Gln Asn Glu Phe Ala Thr

1490 1495 1500

Arg Pro Leu Ser Arg Ile Val Ile Ser Glu Gly Arg Ala Cys Phe Ser
1505 1510 1515 1520

Asn Ser Asp Gly Ala Thr Val Cys Val Asn Ile Ala Asp Asn Gly Arg
1525 1530 1535
(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 4937 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE : DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:



TAAATATACA AGATAATAAA AATAAATCAA GATTTTTGTG ATGACAAACA ACAATTACAA 60
CACCTTTTTT GCAGTCTATA TGCAAATATT TTAAAAAAAT AGTATAAATC CGCCATATAA 120
AATGGTATAA TCTTTCATCT TTCATCTTTA ATCTTTCATC TTTCATCTTT CATCTTTCAT 180
CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC ATCTTTCATC TTTCATCTTT 240
CACATGAAAT GATGAACCGA GGGAAGGGAG GGAGGGGCAA GAATGAAGAG GGAGCTGAAC 300
GAACGCAAAT GATAAAGTAA TTTAATTGTT CAACTAACCT TAGGAGAAAA TATGAACAAG 360
ATATATCGTC TCAAATTCAG CAAACGCCTG AATGCTTTGG TTGCTGTGTC TGAATTGGCA 420
CGGGGTTGTG ACCATTCCAC AGAAAAAGGC TTCCGCTATG TTACTATCTT TAGGTGTAAC 480
CACTTAGCGT TAAAGCCACT TTCCGCTATG TTACTATCTT TAGGTGTAAC ATCTATTCCA 540
CAATCTGTTT TAGCAAGCGG CTTACAAGGA ATGGATGTAG TACACGGCAC AGCCACTATG 600
CAAGTAGATG GTAATAAAAC CATTATCCGC AACAGTGTTG ACGCTATCAT TAATTGGAAA 660
CAATTTAACA TCGACCAAAA TGAAATGGTG CAGTTTTTAC AAGAAAACAA CAACTCCGCC 720
GTATTCAACC GTGTTACATC TAACCAAATC TCCCAATTAA AAGGGATTTT AGATTCTAAC 780
GGACAAGTCT TTTTAATCAA CCCAAATGGT ATCACAATAG GTAAAGACGC AATTATTAAC 840
ACTAATGGCT TTACGGCTTC TACGCTAGAC ATTTCTAACG AAAACATCAA GGCGCGTAAT 900
TTCACCTTCG AGCAAACCAA AGATAAAGCG CTCGCTGAAA TTGTGAATCA CGGTTTAATT 960
ACTGTCGGTA AAGACGGCAG TGTAAATCTT ATTGGTGGCA AAGTGAAAAA CGAGGGTGTG 1020
ATTAGCGTAA ATGGTGGCAG CATTTCTTTA CTCGCAGGGC AAAAAATCAC CATCAGCGAT 1080
ATAATAAACC CAACCATTAC TTACAGCATT GCCGCGCCTG AAAATGAAGC GGTCAATCTG 1140
GGCGATATTT TTGCCAAAGG CGGTAACATT AATGTCCGTG CTGCCACTAT TCGAAACCAA 1200
GGTAAACTTT CTGCTGATTC TGTAAGCAAA GATAAAAGCG GCAATATTGT TCTTTCCGCC 1260
AAAGAGGGTG AAGCGGAAAT TGGCGGTGTA ATTTCCGCTC AAAATCAGCA AGCTAAAGGC 1320
GGCAAGCTGA TGATTACAGG CGATAAAGTC ACATTAAAAA CAGGTGCAGT TATCGACCTT 1380
TCAGGTAAAG AAGGGGGAGA AACTTACCTT GGCGGTGACG AGCGCGGCGA AGGTAAAAAC 1440
GGCATTCAAT TAGCAAAGAA AACCTCTTTA GAAAAAGGCT CAACCATCAA TGTATCAGGC 1500
AAAGAAAAAG GCGGACGCGC TATTGTGTGG GGCGATATTG CGTTAATTGA CGGCAATATT 1560
AACGCTCAAG GTAGTGGTGA TATCGCTAAA ACCGGTGGTT TTGTGGAGAC ATCGGGGCAT 1620
TATTTATCCA TTGACAGCAA TGCAATTGTT AAAACAAAAG AGTGGTTGCT AGACCCTGAT 1680




GATGTAACAA TTGAAGCCGA AGACCCCCTT CGCAATAATA CCGGTATAAA TGATGAATTC 
CCAACAGGCA CCGGTGAAGC AAGCGACCCT AAAAAAAATA GCGAACTCAA AACAACGCTA 
ACCAATACAA CTATTTCAAA TTATCTGAAA AACGCCTGGA CAATGAATAT AACGGCATCA 
AGAAAACTTA CCGTTAATAG CTCAATCAAC ATCGGAAGCA ACTCCCACTT AATTCTCCAT 
AGTAAAGGTC AGCGTGGCGG AGGCGTTCAG ATTGATGGAG ATATTACTTC TAAAGGCGGA 
AATTTAACCA TTTATTCTGG CGGATGGGTT GATGTTCATA AAAATATTAC GCTTGATCAG 
GGTTTTTTAA ATATTACCGC CGCTTCCGTA GCTTTTGAAG GTGGAAATAA CAAAGCACGC 
GACGCGGCAA ATGCTAAAAT TGTCGCCCAG GGCACTGTAA CCATTACAGG AGAGGGAAAA 
GATTTCAGGG CTAACAACGT ATCTTTAAAC GGAACGGGTA AAGGTCTGAA TATCATTTCA 
TCAGTGAATA ATTTAACCCA CAATCTTAGT GGCACAATTA ACATATCTGG GAATATAACA
ATTAACCAAA CTACGAGAAA GAACACCTCG TATTGGCAAA CCAGCCATGA TTCGCACTGG
AACGTCAGTG CTCTTAATCT AGAGACAGGC GCAAATTTTA CCTTTATTAA ATACATTTCA
AGCAATAGCA AAGGCTTAAC AACACAGTAT AGAAGCTCTG CAGGGGTGAA TTTTAACGGC 
GTAAATGGCA ACATGTCATT CAATCTCAAA GAAGGAGCGA AAGTTAATTT CAAATTAAAA
CCAAACGAGA ACATGAACAC AAGCAAACCT TTACCAATTC GGTTTTTAGC CAATATCACA
GCCACTGGTG GGGGCTCTGT TTTTTTTGAT ATATATGCCA ACCATTCTGG CAGAGGGGCT 
GAGTTAAAAA TGAGTGAAAT TAATATCTCT AACGGCGCTA ATTTTACCTT AAATTCCCAT 
GTTCGCGGCG ATGACGCTTT TAAAATCAAC AAAGACTTAA CCATAAATGC AACCAATTCA 
AATTTCAGCC TCAGACAGAC GAAAGATGAT TTTTATGACG GGTACGCACG CAATGCCATC 
AATTCAACCT ACAACATATC CATTCTGGGC GGTAATGTCA CCCTTGGTGG ACAAAACTCA 
AGCAGCAGCA TTACGGGGAA TATTACTATC GAGAAAGCAG CAAATGTTAC GCTAGAAGCC
AATAACGCCC CTAATCAGCA AAACATAAGG GATAGAGTTA TAAAACTTGG CAGCTTGCTC 
GTTAATGGGA GTTTAAGTTT AACTGGCGAA AATGCAGATA TTAAAGGCAA TCTCACTATT 
TCAGAAAGCG CCACTTTTAA AGGAAAGACT AGAGATACCC TAAATATCAC CGGCAATTTT 
ACCAATAATG GCACTGCCGA AATTAATATA ACACAAGGAG TGGTAAAACT TGGCAATGTT 
ACCAATGATG GTGATTTAAA CATTACCACT CACGCTAAAC GCAACCAAAG AAGCATCATC 
GGCGGAGATA TAATCAACAA AAAAGGAAGC TTAAATATTA CAGACAGTAA TAATGATGCT 
GAAATCCAAA TTGGCGGCAA TATCTCGCAA AAAGAAGGCA ACCTCACGAT TTCTTCCGAT 3360 
AAAATTAATA TCACCAAACA GATAACAATC AAAAAGGGTA TTGATGGAGA GGACTCTAGT 3420 
TCAGATGCGA CAAGTAATGC CAACCTAACT ATTAAAACCA AAGAATTGAA ATTGACAGAA 
GACCTAAGTA TTTCAGGTTT CAATAAAGCA GAGATTACAG CCAAAGATGG TAGAGATTTA 
ACTATTGGCA ACAGTAATGA CGGTAACAGC GGTGCCGAAG CCAAAACAGT AACTTTTAAC 
AATGTTAAAG ATTCAAAAAT CTCTGCTGAC GGTCACAATG TGACACTAAA TAGCAAAGTG 
AAAACATCTA GCAGCAATGG CGGACGTGAA AGCAATAGCG ACAACGATAC CGGCTTAACT 
ATTACTGCAA AAAATGTACA AGTAAACAAA CA T AT T AG„ CTCTCAAAAC AGTAAATATC
ACGGGGTCGG AAAAGGTTAC GACCACAGCA GGCTCGACCA TTAACGCAAC AAATGGCAAA 
ggaagta™ GAAGGAAAAG AGGTGATATC AGCGGXAGGA TT.GGGGTAA GACGGTAAGT 
GTTAGCGCGA CTGGTGATTT AAGGAGTAAA TCCGGCTCAA AAATTGAAGC GAAATCGGGT 
GAGGCTAATG TAAGAAGTGG AAGAGGTAGA ATTGGCGGTA CAATTTCCGG TAATACGGTA 
AATGTTAGGG GAAAGGGTGG GGATTTAAGA GTTGGGAATG GCGCAGAAAT TAATGGGAGA 
GAAGGAGCTG GAAGGTTAAG CGGAAGAGGG AATACCTTGA CTACTGAAGC CGGTTCTAGC 
ATCACTTCAA GTAAGGGTCA GGTAGACCTC TTGGGTGAGA ATGGTAGCAT GGCAGGAAGG 
ATTAATGCTG GTAATGTGAG ATTAAATACT ACAGGCACCT TAAGGAGCGT GGCAGGCTCG 
GATATTAAAG GAAGCAGGGG GAGCTTGGTT ATTAACGCAA AAGATGCTAA GGTAAATGGT 
GATGCATCAG GTGATAGTAC AGAAGTGAAT GGAGTGAAGG CAAGCGGCTC TGGTAGTGTG 
ACTGCGGGAA GGTGAAGGAG TGTGAATATC AGTGGGGATT TAAACACAGT AAATGGGTTA 
AATATCATTT CGAAAGATGG TAGAAACACT GTGCGCTTAA GAGGCAAGGA AATTGAGGTG 
AAATATATCC AGCGAGGTGX AGCAAGTGTA GAAGAAGTAA TTGAAGCGAA ACGCGTCCTT 
GAAAAAGTAA AAGATTTATC TGATGAAGAA AGAGAAACAT TAGCTAAACT TGGTGTAAGT 
GCTGTACGTT TTGTTGAGCC AAATAATACA ATTACAGTCA ATACACAAAA TGAATTTACA 
ACCAGACCGT GAAGTCAAGT GATAATTTCT GAAGGTAAGG GGTGTTTCTC AAGTGGTAAT 
GGGGCAGGAG TATGTACCAA TCTTGCTCAC GATGGACAGC CGTAGTCAGT AATTGAGAAG 
GTAGATTTGA TCCTGCAATG AAGTCATTTT ATTTTCGTAT TATTTACTGT GTGGGTTAAA 
GTTCAGTACG GGCTTTACCC ATCTTGTAAA AAATTACGGA GAATACAATA AAGTATTTTT 
AACAGGTTAT TATTATG 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1477 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
Val Ala Val Ser Glu Leu Ala Arg Gly Cys Asp His Ser Thr Glu Lys
Gly Ser Glu Lys Pro Ala Arg Met Lys Val Arg His Leu Ala Leu Lys
Pro Leu Ser Ala Met Leu Leu Ser Leu Gly Val Thr Ser Ile Pro Gln
Ser Val Leu Ala Ser Gly Leu Gln Gly Met Asp Val Val His Gly Thr
65 70 75 80

Ala Thr Met Gln Val Asp Gly Asn Lys Thr Ile Ile Arg Asn Ser Val
85 90 95

Asp Ala Ile Ile Asn Trp Lys Gln Phe Asn Ile Asp Gln Asn Glu Met
100 105 110

Val Gln Phe Leu Gln Glu Asn Asn Asn Ser Ala Val Phe Asn Arg Val
115 120 125

Thr Ser Asn Gln Ile Ser Gln Leu Lys Gly Ile Leu Asp Ser Asn Gly
130 135 140

Gln Val Phe Leu Ile Asn Pro Asn Gly Ile Thr Ile Gly Lys Asp Ala
145 150 155 160

Ile Ile Asn Thr Asn Gly Phe Thr Ala Ser Thr Leu Asp Ile Ser Asn
165 170 175

Glu Asn Ile Lys Ala Arg Asn Phe Thr Phe Glu Gln Thr Lys Asp Lys
180 185 190

Ala Leu Ala Glu Ile Val Asn His Gly Leu Ile Thr Val Gly Lys Asp
195 200 205

Gly Ser Val Asn Leu Ile Gly Gly Lys Val Lys Asn Glu Gly Val Ile
210 215 220

Ser Val Asn Gly Gly Ser Ile Ser Leu Leu Ala Gly Gln Lys Ile Thr
225 230 235 240

Ile Ser Asp Ile Ile Asn Pro Thr Ile Thr Tyr Ser Ile Ala Ala Pro
245 250 255

Glu Asn Glu Ala Val Asn Leu Gly Asp Ile Phe Ala Lys Gly Gly Asn
260 265 270

Ile Asn Val Arg Ala Ala Thr Ile Arg Asn Gln Gly Lys Leu Ser Ala
275 280 285

Asp Ser Val Ser Lys Asp Lys Ser Gly Asn Ile Val Leu Ser Ala Lys
290 295 300

Glu Gly Glu Ala Glu Ile Gly Gly Val Ile Ser Ala Gln Asn Gln Gln
305 310 315 320

Ala Lys Gly Gly Lys Leu Met Ile Thr Gly Asp Lys Val Thr Leu Lys
325 330 335

Thr Gly Ala Val Ile Asp Leu Ser Gly Lys Glu Gly Gly Glu Thr Tyr
340 345 350

Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala
355 360 365

Lys Lys Thr Ser Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys
370 375 380

Glu Lys Gly Gly Phe Ala Ile Val Trp Gly Asp Ile Ala Leu Ile Asp
385 390 395 400

Gly Asn Ile Asn Ala Gln Gly Ser Gly Asp Ile Ala Lys Thr Gly Gly
405 410 415
Phe Val Glu Thr Ser Gly His Asp Leu Phe Ile Lys Asp Asn Ala Ile
420 425 430

Val Asp Ala Lys Glu Trp Leu Leu Asp Phe Asp Asn Val Ser Ile Asn
435 440 445

Ala Glu Asp Pro Leu Phe Asn Asn Thr Gly Ile Asn Asp Glu Phe Pro
450 455 460

Thr Gly Thr Gly Glu Ala Ser Asp Pro Lys Lys Asn Ser Glu Leu Lys
465 470 475 480

Thr Thr Leu Thr Asn Thr Thr Ile Ser Asn Tyr Leu Lys Asn Ala Trp
485 490 495

Thr Met Asn Ile Thr Ala Ser Arg Lys Leu Thr Val Asn Ser Ser Ile
500 505 510

Asn Ile Gly Ser Asn Ser His Leu Ile Leu His Ser Lys Gly Gln Arg
515 520 525

Gly Gly Gly Val Gln Ile Asp Gly Asp Ile Thr Ser Lys Gly Gly Asn
530 535 540

Leu Thr Ile Tyr Ser Gly Gly Trp Val Asp Val His Lys Asn Ile Thr
545 550 555 560

Leu Asp Gln Gly Phe Leu Asn Ile Thr Ala Ala Ser Val Ala Phe Glu
565 570 575

Gly Gly Asn Asn Lys Ala Arg Asp Ala Ala Asn Ala Lys Ile Val Ala
580 585 590

Gln Gly Thr Val Thr Ile Thr Gly Glu Gly Lys Asp Phe Arg Ala Asn
595 600 605

Asn Val Ser Leu Asn Gly Thr Gly Lys Gly Leu Asn Ile Ile Ser Ser
610 615 620

Val Asn Asn Leu Thr His Asn Leu Ser Gly Thr Ile Asn Ile Ser Gly
625 630 635 640

Asn Ile Thr Ile Asn Gln Thr Thr Arg Lys Asn Thr Ser Tyr Trp Gln
645 650 655

Thr Ser His Asp Ser His Trp Asn Val Ser Ala Leu Asn Leu Glu Thr
660 665 670

Gly Ala Asn Phe Thr Phe Ile Lys Tyr Ile Ser Ser Asn Ser Lys Gly
675 680 685

Leu Thr Thr Gln Tyr Arg Ser Ser Ala Gly Val Asn Phe Asn Gly Val
690 695 700

Asn Gly Asn Met Ser Phe Asn Leu Lys Glu Gly Ala Lys Val Asn Phe
705 710 715 720

Lys Leu Lys Pro Asn Glu Asn Met Asn Thr Ser Lys Pro Leu Pro Ile
725 730 735

Arg Phe Leu Ala Asn Ile Thr Ala Thr Gly Gly Gly Ser Val Phe Phe
740 745 750

Asp Ile Tyr Ala Asn His Ser Gly Arg Gly Ala Glu Leu Lys Met Ser
755 760 765






Glu Ile Asn Ile Ser Asn Gly Ala Asn Phe Thr Leu Asn Ser His Val
770 775 780

Arg Gly Asp Asp Ala Phe Lys Ile Asn Lys Asp Leu Thr Ile Asn Ala
785 790 795 800

Thr Asn Ser Asn Phe Ser Leu Arg Gln Thr Lys Asp Asp Phe Tyr Asp
805 810 815

Gly Tyr Ala Arg Asn Ala Ile Asn Ser Thr Tyr Asn Ile Ser Ile Leu
820 825 830

Gly Gly Asn Val Thr Leu Gly Gly Gln Asn Ser Ser Ser Ser Ile Thr
835 840 845

Gly Asn Ile Thr Ile Glu Lys Ala Ala Asn Val Thr Leu Glu Ala Asn
850 855 860

Asn Ala Pro Asn Gln Gln Asn Ile Arg Asp Arg Val Ile Lys Leu Gly
865 870 875 880

Ser Leu Leu Val Asn Gly Ser Leu Ser Leu Thr Gly Glu Asn Ala Asp
885 890 895

Ile Lys Gly Asn Leu Thr Ile Ser Glu Ser Ala Thr Phe Lys Gly Lys
900 905 910

Thr Arg Asp Thr Leu Asn Ile Thr Gly Asn Phe Thr Asn Asn Gly Thr
915 920 925

Ala Glu Ile Asn Ile Thr Gln Gly Val Val Lys Leu Gly Asn Val Thr
930 935 940

Asn Asp Gly Asp Leu Asn Ile Thr Thr His Ala Lys Arg Asn Gln Arg
945 950 955 960

Ser Ile Ile Gly Gly Asp Ile Ile Asn Lys Lys Gly Ser Leu Asn Ile
965 970 975

Thr Asp Ser Asn Asn Asp Ala Glu Ile Gln Ile Gly Gly Asn Ile Ser
980 985 990

Gln Lys Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Ile Asn Ile Thr
995 1000 1005

Lys Gln Ile Thr Ile Lys Lys Gly Ile Asp Gly Glu Asp Ser Ser Ser
1010 1015 1020

Asp Ala Thr Ser Asn Ala Asn Leu Thr Ile Lys Thr Lys Glu Leu Lys
1025 1030 1035 1040

Leu Thr Glu Asp Leu Ser Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr
1045 1050 1055

Ala Lys Asp Gly Arg Asp Leu Thr Ile Gly Asn Ser Asn Asp Gly Asn
1060 1065 1070

Ser Gly Ala Glu Ala Lys Thr Val Thr Phe Asn Asn Val Lys Asp Ser
1075 1080 1085

Lys Ile Ser Ala Asp Gly His Asn Val Thr Leu Asn Ser Lys Val Lys
1090 1095 1100

Thr Ser Ser Ser Asn Gly Gly Arg Glu Ser Asn Ser Asp Asn Asp Thr
1105 1110 1115 1120
Gly Leu Thr Ile Thr Ala Lys Asn Val Glu Val Asn Lys Asp Ile Thr
1125 1130 1135

Ser Leu Lys Thr Val Asn Ile Thr Ala Ser Glu Lys Val Thr Thr Thr
1140 1145 1150

Ala Gly Ser Thr Ile Asn Ala Thr Asn Gly Lys Ala Ser Ile Thr Thr
1155 1160 1165

Lys Thr Gly Asp Ile Ser Gly Thr Ile Ser Gly Asn Thr Val Ser Val
1170 1175 1180

Ser Ala Thr Val Asp Leu Thr Thr Lys Ser Gly Ser Lys Ile Glu Ala
1185 1190 1195 1200

Lys Ser Gly Glu Ala Asn Val Thr Ser Ala Thr Gly Thr Ile Gly Gly
1205 1210 1215

Thr Ile Ser Gly Asn Thr Val Asn Val Thr Ala Asn Ala Gly Asp Leu
1220 1225 1230

Thr Val Gly Asn Gly Ala Glu Ile Asn Ala Thr Glu Gly Ala Ala Thr
1235 1240 1245

Leu Thr Ala Thr Gly Asn Thr Leu Thr Thr Glu Ala Gly Ser Ser Ile
1250 1255 1260

Thr Ser Thr Lys Gly Gln Val Asp Leu Leu Ala Gln Asn Gly Ser Ile
1265 1270 1275 1280

Ala Gly Ser Ile Asn Ala Ala Asn Val Thr Leu Asn Thr Thr Gly Thr
1285 1290 1295

Leu Thr Thr Val Ala Gly Ser Asp Ile Lys Ala Thr Ser Gly Thr Leu
1300 1305 1310

Val Ile Asn Ala Lys Asp Ala Lys Leu Asn Gly Asp Ala Ser Gly Asp
1315 1320 1325

Ser Thr Glu Val Asn Ala Val Asn Ala Ser Gly Ser Gly Ser Val Thr
1330 1335 1340

Ala Ala Thr Ser Ser Ser Val Asn Ile Thr Gly Asp Leu Asn Thr Val
1345 1350 1355 1360

Asn Gly Leu Asn Ile Ile Ser Lys Asp Gly Arg Asn Thr Val Arg Leu
1365 1370 1375

Arg Gly Lys Glu Ile Glu Val Lys Tyr Ile Gln Pro Gly Val Ala Ser
1380 1385 1390

Val Glu Glu Val Ile Glu Ala Lys Arg Val Leu Glu Lys Val Lys Asp
1395 1400 1405

Leu Ser Asp Glu Glu Arg Glu Thr Leu Ala Lys Leu Gly Val Ser Ala
1410 1415 1420

Val Arg Phe Val Glu Pro Asn Asn Thr Ile Thr Val Asn Thr Gln Asn
1425 1430 1435 1440

Glu Phe Thr Thr Arg Pro Ser Ser Gln Val Ile Ile Ser Glu Gly Lys
1445 1450 1455

Ala Cys Phe Ser Ser Gly Asn Gly Ala Arg Val Cys Thr Asn Val Ala
1460 1465 1470

Asp Asp Gly Gln Pro
1475

(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9171 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 



ACAGCGTTCT 


CTTAATACTA 


GTACAAACCC 


ACAATAAAAT 


ATGACAAACA 


ACAATTACAA 


60 


CACCTTTTTT 


GCAGTCTATA 


TGCAAATATT 


TTAAAAAATA 


GTATAAATCC 


GCCATATAAA 


120 


ATGGTATAAT 


CTTTCATCTT 


TCATCTTTCA 


TCTTTCATCT 


TTCATCTTTC 


ATCTTTCATC 


180 


TTTCATCTTT 


CATCTTTCAT 


CTTTCATCTT 


TCATCTTTCA 


TCTTTCATCT 


TTCATCTTTC 


240 


ACATGAAATG 


ATGAACCGAG 


GGAAGGGAGG 


GAGGGGCAAG 


AATGAAGAGG 


GAGCTGAACG 


300 


AACGCAAATG 


ATAAAGTAAT 


TTAATTGTTC 


AACTAACCTT 


AG G AG AAAAT 


ATGAACAAGA 


360 


TATATCGTCT 


C AAATTC AG C 


AAACGCCTGA 


ATGCTTTGGT 


TGCTGTGTCT 


GAATTGGCAC 


420 


GGGGTTGTGA 


CCATTCCACA 


GAAAAAGGCA 


GCGAAAAACC 


TG'CTCGCATG 


AAAGTGCGTC 


480 


ACTTAGCGTT 


AAAGCCACTT 


TCCGCTATGT 


TACTATCTTT 


AGGTGTAACA 


TCTATTCCAC 


540 


AATCTGTTTT 


AGCAAGCGGC 


TTACAAGGAA 


TGGATGTAGT 


ACACGGCACA 


GCCACTATGC 


600 


AAGTAGATGG 


TAATAAAACC 


ATTATCCGCA 


ACAGTGTTGA 


CGCTATCATT 


AATTGGAAAC 


660 


AATTTAACAT 


CGACCAAAAT 


GAAATGGTGC 


AGTTTTTACA 


AGAAAACAAC 


AACTCCGCCG 


720 


TATTCAACCG 


TGTTACATCT 


AACCAAATCT 


C C CAATT AAA 


AGGGATTTTA 


GATTCTAACG - 


780 


GACAAGTCTT 


TTTAATCAAC 


CCAAATGGTA 


TCACAATAGG 


TAAAGACGCA 


ATTATTAACA 


840 


CTAATGGCTT 


TACGGCTTCT 


ACGCTAGACA 


TTTCTAACGA 


AAACATCAAG 


GCGCGTAATT 


900 


TCACCTTCGA 


GCAAACCAAA 


GATAAAGCGC 


TCGCTGAAAT 


TGTGAATCAC 


GGTTTAATTA 


960 


CTGTCGGTAA 


AGACGGCAGT 


GTAAATCTTA 


TTGGTGGCAA 


AGTGAAAAAC 


GAGGGTGTGA 


1020 


TTAGCGTAAA 


TGGTGGCAGC 


ATTTCTTTAC 


TCGCAGGGCA 


AAAAATCACC 


ATCAGCGATA 


1080 


TAATAAACCC 


AACCATTACT 


TACAGCATTG 


CCGCGCCTGA AAATGAAGCG 


GTCAATCTGG 


1140 


GCGATATTTT 


TGCCAAAGGC 


GGTAACATTA 


ATGTCCGTGC 


TGC CACTATT 


CGAAACCAAG 


1200 


CTTTCCGCCA 


AAGAGGGTGA 


AGCGGAAATT 


GGCGGTGTAA 


TTTCCGCTCA 


AAATCAGCAA 


1260 


GCTAAAGGCG 


GCAAGCTGAT 


GATTACAGGC 


GATAAAGTCA 


CATTAAAAAC 


AGGTGCAGTT 


1320 


ATCGACCTTT 


CAGGTAAAGA 


AGGGGGAGAA 


ACTTACCTTG GCGGTGACGA GCGCGGCGAA 


1380 


GGTAAAAACG 


GCATTCAATT 


AGCAAAGAAA 


ACCTCTTTAG 


AAAAAGGCTC 


AACCATCAAT 


1440 


GTATCAGGCA AAGAAAAAGG CGGACGCGCT ATTGTGTGGG GCGATATTGC GTTAATTGAC 1500
GGCAATATTA ACGCTCAAGG TAGTGGTGAT ATCGCTAAAA CCGGTGGTTT TGTGGAGACG 1560
TCGGGGCATG ATTTATTCAT CAAAGACAAT GCAATTGTTG ACGCCAAAGA GTGGTTGTTA 1620
GACCCGGATA ATGTATCTAT TAATGCAGAA ACAGCAGGAC GCAGCAATAC TTCAGAAGAC 1680
GATGAATACA CGGGATCCGG GAATAGTGCC AGCACCCCAA AACGAAACAA AGAAAAGACA 1740
ACATTAACAA ACACAACTCT TGAGAGTATA CTAAAAAAAG GTACCTTTGT TAACATCACT 1800
GCTAATCAAC GCATCTATGT CAATAGCTCC ATTAATTTAT CCAATGGCAG CTTAACTCTT 1860
TGGAGTGAGG GTCGGAGCGG TGGCGGCGTT GAGATTAACA ACGATATTAC CACCGGTGAT 1920
GATACCAGAG GTGCAAACTT AACAATTTAC TCAGGCGGCT GGGTTGATGT TCATAAAAAT 1980
ATCTCACTCG GGGCGCAAGG TAACATAAAC ATTACAGCTA AACAAGATAT CGCCTTTGAG 2040
AAAGGAAGCA ACCAAGTCAT TACAGGTCAA GGGACTATTA CCTCAGGCAA TCAAAAAGGT 2100
TTTAGATTTA ATAATGTCTC TCTAAACGGC ACTGGCAGCG GACTGCAATT CACCACTAAA 2160
AGAACCAATA AATACGCTAT CACAAATAAA TTTGAAGGGA CTTTAAATAT TTCAGGGAAA 2220
GTGAACATCT CAATGGTTTT ACCTAAAAAT GAAAGTGGAT ATGATAAATT CAAAGGACGC 2280
ACTTACTGGA ATTTAACCTC GAAAGTGGAT ATGATAAATT CAAAGGACGC CCTCACTATT 2340
GACTCCAGAG GAAGCGATAG TGCAGGCACA CTTACCCAGC CTTATAATTT AAACGGTATA 2400
TCATTCAACA AAGACACTAC CTTTAATGTT GAACGAAATG CAAGAGTCAA CTTTGACATC 2460
AAGGCACCAA TAGGGATAAA TAAGTATTCT AGTTTGAATT ACGCATCATT TAATGGAAAC 2520
ATTTCAGTTT CGGGAGGGGG GAGTGTTGAT TTCACACTTC TCGCCTCATC CTCTAACGTC 2580
CAAACCCCCG GTGTAGTTAT AAATTCTAAA TACTTTAATG TTTCAACAGG GTCAAGTTTA 2640
AGATTTAAAA CTTCAGGCTC AACAAAAACT GGCTTCTCAA TAGAGAAAGA TTTAACTTTA 2700
AATGCCACCG GAGGCAACAT AACACTTTTG CAAGTTGAAG GCACCGATGG AATGATTGGT 2760
AAAGGCATTG TAGCCAAAAA AAACATAACC TTTGAAGGAG GTAAGATGAG GTTTGGCTCC 2820
AGGAAAGCCG TAACAGAAAT CGAAGGCAAT GTTACTATCA ATAACAACGC TAACGTCACT 2880
CTTATCGGTT CGGATTTTGA CAACCATCAA AAACCTTTAA CTATTAAAAA AGATGTCATC 2940
ATTAATAGCG GCAACCTTAC CGCTGGAGGC AATATTGTCA ATATAGCCGG AAATCTTACC 3000
GTTGAAAGTA ACGCTAATTT CAAAGCTATC ACAAATTTCA CTTTTAATGT AGGCGGCTTG 3060
TTTGACAACA AAGGCAATTC AAATATTTCC ATTGCCAAAG GAGGGGCTCG CTTTAAAGAC 3120
ATTGATAATT CCAAGAATTT AAGCATCACC ACCAACTCCA GCTCCACTTA CCGCACTATT 3180
ATAAGCGGCA ATATAACCAA TAAAAACGGT GATTTAAATA TTACGAACGA AGGTAGTGAT 3240
ACTGAAATGC AAATTGGCGG CGATGTCTCG CAAAAAGAAG GTAATCTCAC GATTTCTTCT 3300
GACAAAATCA ATATTACCAA ACAGATAACA ATCAAGGCAG GTGTTGATGG GGAGAATTCC 3360
GATTCAGACG CGACAAACAA TGCCAATCTA ACCATTAAAA CCAAAGAATT GAAATTAACG 3420
CAAGACCTAA ATATTTCAGG TTTCAATAAA GCAGAGATTA CAGCTAAAGA TGGTAGTGAT 3480
TTAACTATTG GTAACACCAA TAGTGCTGAT GGTACTAATG CCAAAAAAGT AACCTTTAAC 3540




CAGGTTAAAG ATTCAAAAAT CTCTGCTGAC GGTCACAAGG TGACACTACA CAGCAAAGTG 3600
GAAACATCCG GTAGTAATAA CAACACTGAA GATAGCAGTG ACAATAATGC CGGCTTAACT 3660
ATCGATGCAA AAAATGTAAC AGTAAACAAC AATATTACTT CTCACAAAGC AGTGAGCATC 3720
TCTGCGACAA GTGGAGAAAT TACCACTAAA ACAGGTACAA CCATTAACGC AACCACTGGT 3780
AACGTGGAGA TAACCGCTCA AACAGGTAGT ATCCTAGGTG GAATTGAGTC CAGCTCTGGC 3840
TCTGTAACAC TTACTGCAAC CGAGGGCGCT CTTGCTGTAA GCAATATTTC GGGCAACACC 3900
GTTACTGTTA CTGCAAATAG CGGTGCATTA ACCACTTTGG CAGGCTCTAC AATTAAAGGA 3960
ACCGAGAGTG TAACCACTTC AAGTCAATCA GGCGATATCG GCGGTACGAT TTCTGGTGGC 4020
ACAGTAGAGG TTAAAGCAAC CGAAAGTTTA ACCACTCAAT CCAATTCAAA AATTAAAGCA 4080
ACAACAGGCG AGGCTAACGT AACAAGTGCA ACAGGTACAA TTGGTGGTAC GATTTCCGGT 4140
AATACGGTAA ATGTTACGGC AAACGCTGGC GATTTAACAG TTGGGAATGG CGCAGAAATT 4200
AATGCGACAG AAGGAGCTGC AACCTTAACT ACATCATCGG GCAAATTAAC TACCGAAGCT 4260
AGTTCACACA TTACTTCAGC CAAGGGTCAG GTAAATCTTT CAGCTCAGGA TGGTAGCGTT 4320
GCAGGAAGTA TTAATGCCGC CAATGTGACA CTAAATACTA CAGGCACTTT AACTACCGTG 4380
AAGGGTTCAA ACATTAATGC AACCAGCGGT ACCTTGGTTA TTAACGCAAA AGACGCTGAG 4440
CTAAATGGCG CAGCATTGGG TAACCACACA GTGGTAAATG CAACCAACGC AAATGGCTCC 4500
GGCAGCGTAA TCGCGACAAC CTCAAGCAGA GTGAACATCA CTGGGGATTT AATCACAATA 4560
AATGGATTAA ATATCATTTC AAAAAACGGT ATAAACACCG TACTGTTAAA AGGCGTTAAA 4620
ATTGATGTGA AATACATTCA ACCGGGTATA GCAAGCGTAG ATGAAGTAAT TGAAGCGAAA 4680
CGCATCCTTG AGAAGGTAAA AGATTTATCT GATGAAGAAA GAGAAGCGTT AGCTAAACTT 4740
GGCGTAAGTG CTGTACGTTT TATTGAGCCA AATAATACAA TTACAGTCGA TACACAAAAT 4800
GAATTTGCAA CCAGACCATT AAGTCGAATA GTGATTTCTG AAGGCAGGGC GTGTTTCTCA 4860
AACAGTGATG GCGCGACGGT GTGCGTTAAT ATCGCTGATA ACGGGCGGTA GCGGTCAGTA 4920
ATTGACAAGG TAGATTTCAT CCTGCAATGA AGTCATTTTA TTTTCGTATT ATTTACTGTG 4980
TGGGTTAAAG TTCAGTACGG GCTTTACCCA TCTTGTAAAA AATTACGGAG AATACAATAA 5040
AGTATTTTTA ACAGGTTATT ATTATGAAAA ATATAAAAAG CAGATTAAAA CTCAGTGCAA 5100
TATCAGTATT GCTTGGCCTG GCTTCTTCAT CATTGTATGC AGAAGAAGCG TTTTTAGTAA 5160
AAGGCTTTCA GTTATCTGGT GCACTTGAAA CTTTAAGTGA AGACGCCCAA CTGTCTGTAG 5220
CAAAATCTTT ATCTAAATAC CAAGGCTCGC AAACTTTAAC AAACCTAAAA ACAGCACAGC 5280
TTGAATTACA GGCTGTGCTA GATAAGATTG AGCCAAATAA GTTTGATGTG ATATTGCCAC 5340
AACAAACCAT TACGGATGGC AATATTATGT TTGAGCTAGT CTCGAAATCA GCCGCAGAAA 5400
GCCAAGTTTT TTATAAGGCG AGCCAGGGTT ATAGTGAAGA AAATATCGCT CGTAGCCTGC 5460
CATCTTTGAA ACAAGGAAAA GTGTATGAAG ATGGTCGTCA GTGGTTCGAT TTGCGTGAAT 5520
TCAATATGGC AAAAGAAAAT CCACTTAAAG TCACTCGCGT GCATTACGAG TTAAACCCTA 5580
AAAACAAAAC CTCTGATTTG GTAGTTGCAG GTTTTTCGCC TTTTGGCAAA ACGCGTAGCT 5640


TTGTTTCCTA TGATAATTTC GGCGCAAGGG AGTTTAACTA TCAACGTGTA AGTCTAGGTT 5700
TTGTAAATGC CAATTTGACC GGACATGATG ATGTATTAAA TCTAAACGCA TTGACCAATG 5760
TAAAAGCACC ATCAAAATCT TATGCGGTAG GCATAGGATA TACTTATCCG TTTTATGATA 5820
AACACCAATC CTTAAGTCTT TATACCAGCA TGAGTTATGC TGATTCTAAT GATATCGACG 5880
GCTTACCAAG TGCGATTAAT CGTAAATTAT CAAAAGGTCA ATCTATCTCT GCGAATCTGA 5940
AATGGAGTTA TTATCTCCCG ACATTTAACC TTGGAATGGA AGACCAGTTT AAAATTAATT 6000
TAGGCTACAA CTACCGCCAT ATTAATCAAA CATCCGAGTT AAACACCCTG GGTGCAACGA 6060
AGAAAAAATT TGCAGTATCA GGCGTAAGTG CAGGCATTGA TGGACATATC CAATTTACCC 6120
CTAAAACAAT CTTTAATATT GATTTAACTC ATCATTATTA CGCGAGTAAA TTACCAGGCT 6180
CTTTTGGAAT GGAGCGCATT GGCGAAACAT TTAATCGCAG CTATCACATT AGCACAGCCA 6240
GTTTAGGGTT GAGTCAAGAG TTTGCTCAAG GTTGGCATTT TAGCAGTCAA TTATCGGGTC 6300
AGTTTACTCT ACAAGATATA AGTAGCATAG ATTTATTCTC TGTAACAGGT ACTTATGGCG 6360
TCAGAGGCTT TAAATACGGC GGTGCAAGTG GTGAGCGCGG TCTTGTATGG CGTAATGAAT 6420
TAAGTATGCC AAAATACACC CGCTTTCAAA TCAGCCCTTA TGCGTTTTAT GATGCAGGTC 6480
AGTTCCGTTA TAATAGCGAA AATGCTAAAA CTTACGGCGA AGATATGCAC ACGGTATCCT 6540
CTGCGGGTTT AGGCATTAAA ACCTCTCCTA CACAAAACTT AAGCTTAGAT GCTTTTGTTG 6600
CTCGTCGCTT TGCAAATGCC AATAGTGACA ATTTGAATGG CAACAAAAAA CGCACAAGCT 6660
CACCTACAAC CTTCTGGGGT AGATTAACAT TCAGTTTCTA ACCCTGAAAT TTAATCAACT 6720
GGTAAGCGTT CCGCCTACCA GTTTATAACT ATATGCTTTA CCCGCCAATT TACAGTCTAT 6780
ACGCAACCCT GTTTTCATCC TTATATATCA AACAAACTAA GCAAACCAAG CAAACCAAGC 6840
AAACCAAGCA AACCAAGCAA ACCAAGCAAA CCAAGCAAAC CAAGCAAACC AAGCAAACCA 6900
AGCAAACCAA GCAAACCAAG CAAACCAAGC AAACCAAGCA ATGCTAAAAA ACAATTTATA 6960
TGATAAACTA AAACATACTC CATACCATGG CAATACAAGG GATTTAATAA TATGACAAAA 7020
GAAAATTTAC AAAGTGTTCC ACAAAATACG ACCGCTTCAC TTGTAGAATC AAACAACGAC 7080
CAAACTTCCC TGCAAATACT TAAACAACCA CCCAAACCCA ACCTATTACG CCTGGAACAA 7140
CATGTCGCCA AAAAAGATTA TGAGCTTGCT TGCCGCGAAT TAATGGCGAT TTTGGAAAAA 7200
ATGGACGCTA ATTTTGGAGG CGTTCACGAT ATTGAATTTG ACGCACCTGC TCAGCTGGCA 7260
TATCTACCCG AAAAACTACT AATTCATTTT GCCACTCGTC TCGCTAATGC AATTACAACA 7320
CTCTTTTCCG ACCCCGAATT GGCAATTTCC GAAGAAGGGG CATTAAAGAT GATTAGCCTG 7380
CAACGCTGGT TGACGCTGAT TTTTGCCTCT TCCCCCTACG TTAACGCAGA CCATATTCTC 7440
AATAAATATA ATATCAACCC AGATTCCGAA GGTGGCTTTC ATTTAGCAAC AGACAACTCT 7500
TCTATTGCTA AATTCTGTAT TTTTTACTTA CCCGAATCCA ATGTCAATAT GAGTTTAGAT 7560
GCGTTATGGG CAGGGAATCA ACAACTTTGT GCTTCATTGT GTTTTGCGTT GCAGTCTTCA 7620



CGTTTTATTG GTACTGCATC TGCGTTTCAT AAAAGAGCGG TGGTTTTACA GTGGTTTCCT 7680
AAAAAACTCG CCGAAATTGC TAATTTAGAT GAATTGCCTG CAAATATCCT TCATGATGTA 7740
TATATGCACT GCAGTTATGA TTTAGCAAAA AACAAGCACG ATGTTAAGCG TCCATTAAAC 7800
GAACTTGTCC GCAAGCATAT CCTCACGCAA GGATGGCAAG ACCGCTACCT TTACACCTTA 7860
GGTAAAAAGG ACGGCAAACC TGTGATGATG GTACTGCTTG AACATTTTAA TTCGGGACAT 7920
TCGATTTATC GCACGCATTC AACTTCAATG ATTGCTGCTC GAGAAAAATT CTATTTAGTC 7980
GGCTTAGGCC ATGAGGGCGT TGATAACATA GGTCGAGAAG TGTTTGACGA GTTCTTTGAA 8040
ATCAGTAGCA ATAATATAAT GGAGAGACTG TTTTTTATCC GTAAACAGTG CGAAACTTTC 8100
CAACCCGCAG TGTTCTATAT GCCAAGCATT GGCATGGATA TTAGCACGAT TTTTGTGAGC 8160
AACACTCGGC TTGCCCCTAT TCAAGCTGTA GCCTTGGGTC ATCCTGCCAC TACGCATTCT 8220
GAATTTATTG ATTATGTCAT CGTAGAAGAT GATTATGTGG GCAGTGAAGA TTGTTTTAGC 8280
GAAACCCTTT TACGCTTACC CAAAGATGCC CTACCTTATG TACCATCTGC ACTCGCCCCA 8340
CAAAAAGTGG ATTATGTACT CAGGGAAAAC CCTGAAGTAG TCAATATCGG TATTGCCGCT 8400
ACCACAATGA AATTAAACCC TGAATTTTTG CTAACATTGC AAGAAATCAG AGATAAAGCT 8460
AAAGTCAAAA TACATTTTCA TTTCGCACTT GGACAATCAA CAGGCTTGAC ACACCCTTAT 8520
GTCAAATGGT TTATCGAAAG CTATTTAGGT GACGATGCCA CTGCACATCC CCACGCACCT 8580
TATCACGATT ATCTGGCAAT ATTGCGTGAT TGCGATATGC TACTAAATCC GTTTCCTTTC 8640
GGTAATACTA ACGGCATAAT TGATATGGTT ACATTAGGTT TAGTTGGTGT ATGCAAAACG 8700
GGGGATGAAG TACATGAACA TATTGATGAA GGTCTGTTTA AACGCTTAGG ACTACCAGAA 8760
TGGCTGATAG CCGACACACG AGAAACATAT ATTGAATGTG CTTTGCGTCT AGCAGAAAAC 8820
CATCAAGAAC GCCTTGAACT CCGTCGTTAC ATCATAGAAA ACAACGGCTT ACAAAAGCTT 8880
TTTACAGGCG ACCCTCGTCC ATTGGGCAAA ATACTGCTTA AGAAAACAAA TGAATGGAAG 8940
CGGAAGCACT TGAGTAAAAA ATAACGGTTT TTTAAAGTAA AAGTGCGGTT AATTTTCAAA 9000
GCGTTTTAAA AACCTCTCAA AAATCAACCG CACTTTTATC TTTATAACGC TCCCGCGCGC 9060
TGACAGTTTA TCTCTTTCTT AAAATACCCA TAAAATTGTG GCAATAGTTG GGTAATCAAA 9120
TTCAATTGTT GATACGGCAA ACTAAAGACG GCGCGTTCTT CGGCAGTCAT C 9171
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9323 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

CGCCACTTCA ATTTTGGATT GTTGAAATTC AACTAACCAA AAAGTGCGGT TAAAATCTGT 60 



GGAGAAAATA GGTTGTAGTG AAGAACGAGG TAATTGTTCA AAAGGATAAA GCTCTCTTAA 120
TTGGGCATTG GTTGGCGTTT CTTTTTCGGT TAATAGTAAA TTATATTCTG GACGACTATG 180
CAATCCACCA ACAACTTTAC CGTTGGTTTT AAGCGTTAAT GTAAGTTCTT GCTCTTCTTG 240
GCGAATACGT AATCCCATTT TTTGTTTAGC AAGAAAATGA TCGGGATAAT CATAATAGGT 300
GTTGCCCAAA AATAAATTTT GATGTTCTAA AATCATAAAT TTTGCAAGAT ATTGTGGCAA 360
TTCAATACCT ATTTGTGGCG AAATCGCCAA TTTTAATTCA ATTTCTTGTA GCATAATATT 420
TCCCACTCAA ATCAACTGGT TAAATATACA AGATAATAAA AATAAATCAA GATTTTTGTG 480
ATGACAAACA ACAATTACAA CACCTTTTTT GCAGTCTATA TGCAAATATT TTAAAAAAAT 540
AGTATAAATC CGCCATATAA AATGGTATAA TCTTTCATCT TTCATCTTTC ATCTTTCATC 600
TTTCATCTTT CATCTTTCAT CTTTCATCTT TGATCTTTCA TCTTTCATCT TTCATCTTTC 660
ATCTTTCATC TTTCATCTTT CACATGAAAT GATGAACCGA GGGAAGGGAG GGAGGGGCAA 720
GAATGAAGAG GGAGCTGAAC GAACGCAAAT GATAAAGTAA TTTAATTGTT CAACTAACCT 780
TAGGAGAAAA TATGAACAAG ATATATCGTC TCAAATTCAG CAAACGCCTG AATGCTTTGG 840
TTGCTGTGTC TGAATTGGCA CGGGGTTGTG ACCATTCCAC AGAAAAAGGC AGCGAAAAAC 900
CTGCTCGCAT GAAAGTGCGT CACTTAGCGT TAAAGCCACT TTCCGCTATG TTACTATCTT 960
TAGGTGTAAC ATCTATTCCA CAATCTGTTT TAGCAAGCGG CAATTTAACA TCGACCAAAA 1020
TGAAATGGTG CAGTTTTTAC AAGAAAACAA GTAATAAAAC CATTATCCGC AACAGTGTTG 1080
ACGCTATCAT TAATTGGAAA CAATTTAACA TCGACCAAAA TGAAATGGTG CAGTTTTTAC 1140
AAGAAAACAA CAACTCCGCC GTATTCAACC GTGTTACATC TAACCAAATC TCCCAATTAA 1200
AAGGGATTTT AGATTCTAAC GGACAAGTCT TTTTAATCAA CCCAAATGGT ATCACAATAG 1260
GTAAAGACGC AATTATTAAC ACTAATGGCT TTACGGCTTC TACGCTAGAC ATTTCTAACG 1320
AAAACATCAA GGCGCGTAAT TTCACCTTCG AGCAAACCAA AGATAAAGCG CTCGCTGAAA 1380
TTGTGAATCA CGGTTTAATT ACTGTCGGTA AAGACGGCAG TGTAAATCTT ATTGGTGGCA 1440
AAGTGAAAAA CGAGGGTGTG ATTAGCGTAA ATGGTGGCAG CATTTCTTTA CTCGCAGGGC 1500
AAAAAATCAC CATCAGCGAT ATAATAAACC CAACCATTAC TTACAGCATT GCCGCGCCTG 1560
AAAATGAAGC GGTCAATCTG GGCGATATTT TTGCCAAAGG CGGTAACATT AATGTCCGTG 1620
CTGCCACTAT TCGAAACCAA GGTAAACTTT CTGCTGATTC TGTAAGCAAA GATAAAAGCG 1680
GCAATATTGT TCTTTCCGCC AAAGAGGGTG AAGCGGAAAT TGGCGGTGTA ATTTCCGCTC 1740
AAAATCAGCA AGCTAAAGGC GGCAAGCTGA TGATAAAGTC CGATAAAGTC ACATTAAAAA 1800
CAGGTGCAGT TATCGACCTT TCAGGTAAAG AAGGGGGAGA AACTTACCTT GGCGGTGACG 1860
AGCGCGGCGA AGGTAAAAAC GGCATTCAAT TAGCAAAGAA AACCTCTTTA GAAAAAGGCT 1920
CAACCATCAA TGTATCAGGC AAAGAAAAAG GCGGACGCGC TATTGTGTGG GGCGATATTG 1980
CGTTAATTGA CGGCAATATT AACGCTCAAG GTAGTGGTGA TATCGCTAAA ACCGGTGGTT 2040
TTGTGGAGAC ATCGGGGCAT TATTTATCCA TTGACAGCAA TGCAATTGTT AAAACAAAAG 2100
AGTGGTTGCT AGACCCTGAT GATGTAACAA TTGAAGCCGA AGACCCCCTT CGCAATAATA 2160
CCGGTATAAA TGATGAATTC CCAACAGGCA CCGGTGAAGC AAGCGACCCT AAAAAAAATA 2220
G-GAACTCAA AACAACGCTA ACCAATACAA CTATTTCAAA TTATCTGAAA AACGCCTGGA 2280
CAATGAATAT AACGGCATCA AGAAAACTTA CCGTTAATAG CTCAATCAAC ATCGGAAGCA 2340
ACTCCCACTT AATTCTCCAT AGTAAAGGTC AGCGTGGCGG AGGCGTTCAG ATTGATGGAG 2400
ATATTACTTC TAAAGGCGGA AATTTAACCA TTTATTCTGG CGGATGGGTT GATGTTCATA 2460
AAAATATTAC GCTTGATCAG GGTTTTTTAA ATATTACCGC CGCTTCCGTA GCTTTTGAAG 2520
GTGGAAATAA CAAAGCACGC GACGCGGCAA ATGCTAAAAT TGTCGCCCAG GGCAGTGTAA 2580
CCATTACAGG AGAGGGAAAA GATTTCAGGG CTAACAACGT ATCTTTAAAC GGAACGGGTA 2640
AAGGTCTGAA TATCATTTCA TCAGTGAATA ATTTAACCCA CAATCTTAGT GGCACAATTA 2700
ACATATCTGG GAATATAACA ATTAACCAAA CTACGAGAAA GAACACCTCG TATTGGCAAA 2760
CCAGCCATGA TTCGCACTGG AACGTCAGTG CTCTTAATCT AGAGACAGGC GCAAATTTTA 2820
CCTTTATTAA ATACATTTCA AGCAATAGCA AAGGCTTAAC AACACAGTAT AGAAGCTCTG 2880
CAGGGGTGAA TTTTAACGGC GTAAATGGCA ACATGTCATT CAATCTCAAA GAAGGAGCGA 2940
AAGTTAATTT CAAATTAAAA CCAAACGAGA ACATGAACAC AAGCAAACCT TTACCAATTC 3000
GGTTTTTAGC CAATATCACA GCCACTGGTG GGGGCTCTGT TTTTTTTGAT ATATATGCCA 3060
ACCATTCTGG CAGAGGGGCT GAGTTAAAAA TGAGTGAAAT TAATATCTCT AACGGCGCTA 3120
ATTTTACCTT AAATTCCCAT GTTCGCGGCG ATGACGCTTT TAAAATCAAC AAAGACTTAA 3180
CCATAAATGC AACCAATTCA AATTTCAGCC TCAGACAGAC GAAAGATGAT TTTTATGACG 3240
GGTACGCACG CAATGCCATC AATTCAACCT ACAACATATC CATTCTGGGC GGTAATGTCA 3300
CCCTTGGTGG ACAAAACTCA AGCAGCAGCA TTACGGGGAA TATTACTATC GAGAAAGCAG 3360
CAAATGTTAC GCTAGAAGCC AATAACGCCC CTAATCAGCA AAACATAAGG GATAGAGTTA 3420
TAAAACTTGG CAGCTTGCTC GTTAATGGGA GTTTAAGTTT AACTGGCGAA AATGCAGATA 3480
TTAAAGGCAA TCTCACTATT TCAGAAAGCG CCACTTTTAA AGGAAAGACT AGAGATACCC 3540
TAAATATCAC CGGCAATTTT ACCAATAATG GCACTGCCGA AATTAATATA ACACAAGGAG 3600
TGGTAAAACT TGGCAATGTT ACCAATGATG GTGATTTAAA CATTACCACT CACGCTAAAC 3660
GCAACCAAAG AAGCATCATC GGCGGAGATA TAATCAACAA AAAAGGAAGC TTAAATATTA 3720
CAGACAGTAA TAATGATGCT GAAATCCAAA TTGGCGGCAA TATCTCGCAA AAAGAAGGCA 3780
ACCTCACGAT TTCTTCCGAT AAAATTAATA TCACCAAACA GATAACAATC AAAAAGGGTA 3840
TTGATGGAGA GGACTCTAGT TCAGATGCGA CAAGTAATGC CAACCTAACT ATTAAAACCA 3900
AAGAATTGAA ATTGACAGAA GACCTAAGTA TTTCAGGTTT CAATAAAGCA GAGATTACAG 3960
CCAAAGATGG TAGAGATTTA ACTATTGGCA ACAGTAATGA CGGTAACAGC GGTGCCGAAG 4020
CCAAAACAGT AACTTTTAAC AATGTTAAAG ATTCAAAAAT CTCTGCTGAC GGTCACAATG 4080
TGACACTAAA TAGCAAAGTG AAAACATCTA GCAGCAATGG CGGACGTGAA AGCAATAGCG 4140
ACAACGATAC CGGCTTAACT ATTACTGCAA AAAATGTAGA AGTAAACAAA GATATTACTT 4200

CTCTCAAAAC AGTAAATATC ACCGCGTCGG AAAAGGTTAC CACCACAGCA GGCTCGACCA 4260
TTAACGCAAC AAATGGCAAA GCAAGTATTA CAACCAAAAC AGGTGATATC AGCGGTACGA 4320
TTTCCGGTAA CACGGTAAGT GTTAGCGCGA CTGGTGATTT AACCACTAAA TCCGGCTCAA 4380
AAATTGAAGC GAAATCGGGT GAGGCTAATG TAACAAGTGC AACAGGTACA ATTGGCGGTA 4440
CAATTTCCGG TAATACGGTA AATGTTACGG CAAACGCTGG CGATTTAACA GTTGGGAATG 4500
GCGCAGAAAT TAATGCGACA GAAGGAGCTG CAACCTTAAC CGCAACAGGG AATACCTTGA 4560
CTACTGAAGC CGGTTCTAGC ATCACTTCAA CTAAGGGTCA GGTAGACCTC TTGGCTCAGA 4620
ATGGTAGCAT CGCAGGAAGC ATTAATGCTG CTAATGTGAC ATTAAATACT ACAGGCACCT 4680
TAACCACCGT GGCAGGCTCG GATATTAAAG CAACCAGCGG CACCTTGGTT ATTAACGCAA 4740
AAGATGCTAA GCTAAATGGT GATGCATCAG GTGATAGTAC AGAAGTGAAT GCAGTCAACG 4800
ACTGGGGATT TGGTAGTGTG ACTGCGGCAA CCTCAAGCAG TGTGAATATC ACTGGGGATT 4860
TAAACACAGT AAATGGGTTA AATATCATTT CGAAAGATGG TAGAAACACT GTGCGCTTAA 4920
GAGGCAAGGA AATTGAGGTG AAATATATCC AGCCAGGTGT AGCAAGTGTA GAAGAAGTAA 4980
TTGAAGCGAA ACGCGTCCTT GAAAAAGTAA AAGATTTATC TGATGAAGAA AGAGAAACAT 5040
TAGCTAAACT TGGTGTAAGT GCTGTACGTT TTGTTGAGCC AAATAATACA ATTACAGTCA 5100
ATACACAAAA TGAATTTACA ACCAGACCGT CAAGTCAAGT GATAATTTCT GAAGGTAAGG 5160
CGTGTTTCTC AAGTGGTAAT GGCGCACGAG TATGTACCAA TGTTGCTGAC GATGGACAGC 5220
CGTAGTCAGT AATTGACAAG GTAGATTTCA TCCTGCAATG AAGTCATTTT ATTTTCGTAT 5280
TATTTACTGT GTGGGTTAAA GTTCAGTACG GGCTTTACCC ATCTTGTAAA AAATTACGGA 5340
GAATACAATA AAGTATTTTT AACAGGTTAT TATTATGAAA AATATAAAAA GCAGATTAAA 5400
ACTCAGTGCA ATATCAGTAT TGCTTGGCCT GGCTTCTTCA TCATTGTATG CAGAAGAAGC 5460
GTTTTTAGTA AAAGGCTTTC AGTTATCTGG TGCACTTGAA ACTTTAAGTG AAGACGCCCA 5520
ACTGTCTGTA GCAAAATCTT TATCTAAATA CCAAGGCTCG CAAACTTTAA CAAACCTAAA 5580
AACAGCACAG CTTGAATTAC AGGCTGTGCT AGATAAGATT GAGCCAAATA AATTTGATGT 5640
GATATTGCCG CAACAAACCA TTACGGATGG CAATATCATG TTTGAGCTAG TCTCGAAATC 5700
AGCCGCAGAA AGCCAAGTTT TTTATAAGGC GAGCCAGGGT TATAGTGAAG AAAATATCGC 5760
TCGTAGCCTG CCATCTTTGA AACAAGGAAA AGTGTATGAA GATGGTCGTC AGTGGTTCGA 5820
TTTGCGTGAA TTTAATATGG CAAAAGAAAA CCCGCTTAAG GTTACCCGTG TACATTACGA 5880
ACTAAACCCT AAAAACAAAA CCTCTAATTT GATAATTGCG GGCTTCTCGC CTTTTGGTAA 5940
AACGCGTAGC TTTATTTCTT ATGATAATTT CGGCGCGAGA GAGTTTAACT ACCAACGTGT 6000
AAGCTTGGGT TTTGTTAATG CCAATTTAAC TGGTCATGAT GATGTGTTAA TTATACCAGT 6060
ATGAGTTATG CTGATTCTAA TGATATCGAC GGCTTACCAA GTGCGATTAA TCGTAAATTA 6120
TCAAAAGGTC AATCTATCTC TGCGAATCTG AAATGGAGTT ATTATCTCCC AACATTTAAC 6180
CTTGGCATGG AAGACCAATT TAAAATTAAT TTAGGCTACA ACTACCGCCA TATTAATCAA 6240


ACCTCCGCGT TAAATCGCTT GGGTGAAACG AAGAAAAAAT TTGCAGTATC AGGCGTAAGT 6300
GCAGGCATTG ATGGACATAT CCAATTTACC CCTAAAACAA TCTTTAATAT TGATTTAACT 6360
CATCATTATT ACGCGAGTAA ATTACCAGGC TCTTTTGGAA TGGAGCGCAT TGGCGAAACA 6420
TTTAATCGCA GCTATCACAT TAGCACAGCC AGTTTAGGGT TGAGTCAAGA GTTTGCTCAA 6480
GGTTGGCATT TTAGCAGTCA ATTATCAGGT CAATTTACTC TACAAGATAT TAGCAGTATA 6540
GATTTATTCT CTGTAACAGG TACTTATGGC GTCAGAGGCT TTAAATACGG CGGTGCAAGT 6600
GGTGAGCGCG GTCTTGTATG GCGTAATGAA TTAAGTATGC CAAAATACAC CCGCTTCCAA 6660
ATCAGCCCTT ATGCGTTTTA TGATGCAGGT CAGTTCCGTT ATAATAGCGA AAATGCTAAA 6720
ACTTACGGCG AAGATATGCA CACGGTATCC TCTGCGGGTT TAGGCATTAA AACCTCTCCT 6780
ACACAAAACT TAAGCCTAGA TGCTTTTGTT GCTCGTCGCT TTGCAAATGC CAATAGTGAC 6840
AATTTGAATG GCAACAAAAA ACGCACAAGC TCACCTACAA CCTTCTGGGG GAGATTAACA 6900
TTCAGTTTCT AACCCTGAAA TTTAATCAAC TGGTAAGCGT TCCGCCTACC AGTTTATAAC 6960
TATATGCTTT ACCCGCCAAT TTACAGTCTA TAGGCAACCC TGTTTTTACC CTTATATATC 7020
AAATAAACAA GCTAAGCTGA GCTAAGCAAA CCAAGCAAAC TCAAGCAAGC CAAGTAATAC 7080
TAAAAAAACA ATTTATATGA TAAACTAAAG TATACTCCAT GCCATGGCGA TACAAGGGAT 7140
TTAATAATAT GACAAAAGAA AATTTGCAAA ACGCTCCTCA AGATGCGACC GCTTTACTTG 7200
CGGAATTAAG CAACAATCAA ACTCCCCTGC GAATATTTAA ACAACCACGC AAGCCCAGCC 7260
TATTACGCTT GGAACAACAT ATCGCAAAAA AAGATTATGA GTTTGCTTGT CGTGAATTAA 7320
TGGTGATTCT GGAAAAAATG GACGCTAATT TTGGAGGCGT TCACGATATT GAATTTGACG 7380
CACCCGCTCA GCTGGCATAT CTACCCGAAA AATTACTAAT TTATTTTGCC ACTCGTCTCG 7440
CTAATGCAAT TACAACACTC TTTTCCGACC CCGAATTGGC AATTTCTGAA GAAGGGGCGT 7500
TAAAGATGAT TAGCCTGCAA CGCTGGTTGA CGCTGATTTT TGCCTCTTCC CCCTACGTTA 7560
ACGCAGACCA TATTCTCAAT AAATATAATA TCAACCCAGA TTCCGAAGGT GGCTTTCATT 7620
TAGCAACAGA CAACTCTTCT ATTGCTAAAT TCTGTATTTT TTACTTACCC GAATCCAATG 7680
TCAATATGAG TTTAGATGCG TTATGGGCAG GGAATCAACA ACTTTGTGCT TCATTGTGTT 7740
TTGCGTTGCA GTCTTCACGT TTTATTGGTA CCGCATCTGC GTTTCATAAA AGAGCGGTGG 7800
TTTTACAGTG GTTTCCTAAA AAACTCGCCG AAATTGCTAA TTTAGATGAA TTGCCTGCAA 7860
ATATCCTTCA TGATGTATAT ATGCACTGCA GTTATGATTT AGCAAAAAAC AAGCACGATG 7920
TTAAGCGTCC ATTAAACGAA CTTGTCCGCA AGCATATCCT CACGCAAGGA TGGCAAGACC 7980
GCTACCTTTA CACCTTAGGT AAAAAGGACG GCAAACCTGT GATGATGGTA CTGCTTGAAC 8040
ATTTTAATTC GGGACATTCG ATTTATCGTA CACATTCAAC TTCAATGATT GCTGCTCGAG 8100
AAAAATTCTA TTTAGTCGGC TTAGGCCATG AGGGCGTTGA TAAAATAGGT CGAGAAGTGT 8160
TTGACGAGTT CTTTGAAATC AGTAGCAATA ATATAATGGA GAGACTGTTT TTTATCCGTA 8220
AACAGTGCGA AACTTTCCAA CCCGCAGTGT TCTATATGCC AAGCATTGGC ATGGATATTA 8280


CCACGATTTT TGTGAGCAAC ACTCGGCTTG CCCCTATTCA AGCTGTAGCC CTGGGTCATC 8340
CTGCCACTAC GCATTCTGAA TTTATTGATT ATGTCATCGT AGAAGATGAT TATGTGGGCA 8400
GTGAAGATTG TTTCAGCGAA ACCCTTTTAC GCTTACCCAA AGATGCCCTA CCTTATGTAC 8460
CTTCTGCACT CGCCCCACAA AAAGTGGATT ATGTACTCAG GGAAAACCCT GAAGTAGTCA 8520
ATATCGGTAT TGCCGCTACC ACAATGAAAT TAAACCCTGA ATTTTTGCTA ACATTGCAAG 8580
AAATCAGAGA TAAAGCTAAA GTCAAAATAC ATTTTCATTT CGCACTTGGA CAATCAACAG 8640
GCTTGACACA CCCTTATGTC AAATGGTTTA TCGAAAGCTA TTTAGGTGAC GATGCCACTG 8700
CACATCCCCA CGCACCTTAT CACGATTATC TGGCAATATT GCGTGATTGC GATATGCTAC 8760
TAAATCCGTT TCCTTTCGGT AATACTAACG GCATAATTGA TATGGTTACA TTAGGTTTAG 8820
TTGGTGTATG CAAAACGGGG GATGAAGTAC ATGAACATAT TGATGAAGGT CTGTTTAAAC 8880
GCTTAGGACT ACCAGAATGG CTGATAGCCG ACACACGAGA AACATATATT GAATGTGCTT 8940
TGCGTCTAGC AGAAAACCAT CAAGAACGCC TTGAACTCCG TCGTTACATC ATAGAAAACA 9000
ACGGCTTACA AAAGCTTTTT ACAGGCGACC CTCGTCCATT GGGCAAAATA CTGCTTAAGA 9060
AAACAAATGA ATGGAAGCGG AAGCACTTGA GTAAAAAATA ACGGTTTTTT AAAGTAAAAG 9120
TGCGGTTAAT TTTCAAAGCG TTTTAAAAAC GTCTCAAAAA TCAACCGCAC TTTTATCTTT 9180
ATAACGATCC CGCACGCTGA CAGTTTATCA GCCTCCCGCC ATAAAACTCC GCCTTTCATG 9240
GCGGAGATTT TAGCCAAAAC TGGCAGAAAT TAAAGGCTAA AATCACCAAA TTGCACCACA 9300
AAATCACCAA TACCCACAAA AAA
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4794 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
ATGAACAAGA TATATCGTCT CAAATTCAGC AAACGCCTGA ATGCTTTGGT TGCTGTGTCT 60
GAATTGACAC GGGGTTGTGA CCATTCCACA GAAAAAGGCA GTGAAAAACC TGTTCGTACG 120
AAAGTACGCC ACTTGGCGTT AAAGCCACTT TCCGCTATAT TGCTATCTTT GGGCATGGCA 180
TCCATTCCGC AATCTGTTTT AGCGAGCGGT TTACAGGGAA TGAGCGTCGT ACACGGTACA 240
GCAACCATGC AAGTAGACGG CAATAAAACC ACTATCCGTA ATAGCGTCAA TGCTATCATC 300
AATTGGAAAC AATTTAACAT TGACCAAAAT GAAATGGTGC AGTTTTTACA AGAAAGCAGC 360
AACTCTGCCG TTTTCAACCG TGTTACATCT GACCAAATCT CCCAATTAAA AGGGATTTTA 420
GATTCTAACG GACAAGTCTT TTTAATCAAC CCAAATGGTA TCACAATAGG TAAAGACGCA 480
ATTATTAACA CTAATGGCTT TACTGCTTCT ACGCTAGACA TTTCTAACGA AAACATCAAG 540
GCGCGTAATT TCACCCTTGA GCAAACCAAG GATAAAGCAC TCGCTGAAAT CGTGAATCAC 600
GGTTTAATTA CCGTTGGTAA AGACGGTAGC GTAAACCTTA TTGGTGGCAA AGTGAAAAAC 660
GAGGGCGTGA TTAGCGTAAA TGGCGGTAGT ATTTCTTTAC TTGCAGGGCA AAAAATCACC 720
ATCAGCGATA TAATAAATCC AACCATCACT TACAGCATTG CTGCACCTGA AAACGAAGCG 780
ATCAATCTGG GCGATATTTT TGCCAAAGGT GGTAACATTA ATGTCCGCGC TGCCACTATT 840
CGCAATAAAG GTAAACTTTC TGCCGACTCT GTAAGCAAAG ATAAAAGTGG TAACATTGTT 900
CTCTCTGCCA AAGAAGGTGA AGCGGAAATT GGCGGTGTAA TTTCCGCTCA AAATCAGCAA 960
GCCAAAGGTG GTAAGTTGAT GATTACAGGC GATAAAGTTA CATTGAAAAC GGGTGCAGTT 1020
ATCGACCTTT CGGGTAAAGA AGGGGGAGAA ACTTATCTTG GCGGTGACGA GCGTGGCGAA 1080
GGTAAAAACG GCATTCAATT AGCAAAGAAA ACCACTTTAG AAAAAGGCTC AACAATTAAT 1140
GTGTCAGGTA AAGAAAAAGG TGGGCGCGCT ATTGTATGGG GCGATATTGC GTTAATTGAC 1200
GGCAATATTA ATGCCCAAGG TAAAGATATC GCTAAAACTG GTGGTTTTGT GGAGACGTCG 1260
GGGCATTACT TATCCATTGA TGATAACGCA ATTGTTAAAA CAAAAGAATG GCTACTAGAC 1320
CCAGAGAATG TGACTATTGA AGCTCCTTCC GCTTCTCGCG TCGAGCTGGG TGCCGATAGG 1380
AATTCCCACT CGGCAGAGGT GATAAAAGTG ACCCTAAAAA AAAATAACAC CTCCTTGACA 1440
ACACTAACCA ATACAACCAT TTCAAATCTT CTGAAAAGTG CCCACGTGGT GAACATAACG 1500
GCAAGGAGAA AACTTACCGT TAATAGCTCT ATCAGTATAG AAAGAGGCTC CCACTTAATT 1560
CTCCACAGTG AAGGTCAGGG CGGTCAAGGT GTTCAGATTG ATAAAGATAT TACTTCTGAA 1620
GGCGGAAATT TAACCATTTA TTCTGGCGGA TGGGTTGATG TTCATAAAAA TATTACGCTT 1680
GGTAGCGGCT TTTTAAACAT CACAACTAAA GAAGGAGATA TCGCCTTCGA AGACAAGTCT 1740
GGACGGAACA ACCTAACCAT TACAGCCCAA GGGACCATCA CCTCAGGTAA TAGTAACGGC 1800
TTTAGATTTA ACAACGTCTC TCTAAACAGC CTTGGCGGAA AGCTGAGCTT TACTGACAGC 1860
AGAGAGGACA GAGGTAGAAG AACTAAGGGT AATATCTCAA ACAAATTTGA CGGAACGTTA 1920
AACATTTCCG GAACTGTAGA TATCTCAATG AAAGCACCCA AAGTCAGCTG GTTTTACAGA 1980
GACAAAGGAC GCACCTACTG GAACGTAACC ACTTTAAATG TTACCTCGGG TAGTAAATTT 2040
AACCTCTCCA TTGACAGCAC AGGAAGTGGC TCAACAGGTC CAAGCATACG CAATGCAGAA 2100
TTAAATGGCA TAACATTTAA TAAAGCCACT TTTAATATCG CACAAGGCTC AACAGCTAAC 2160
TTTAGCATCA AGGCATCAAT AATGCCCTTT AAGAGTAACG CTAACTACGC ATTATTTAAT 2220
GAAGATATTT CAGTCTCAGG GGGGGGTAGC CTTAATTTCA AACTTAACGC CTCATCTAGC 2280
AACATACAAA CCCCTGGCGT AATTATAAAA TCTCAAAACT TTAATGTCTC AGGAGGGTCA 2340
ACTTTAAATC TCAAGGCTGA AGGTTCAACA GAAACCGCTT TTTCAATAGA AAATGATTTA 2400
AACTTAAACG CCACCGGTGG CAATATAACA ATCAGACAAG TCGAGGGTAC CGATTCACGC 2460
GTCAACAAAG GTGTCGCAGC CAAAAAAAAC ATAACTTTTA AAGGGGGTAA TATCACCTTC 2520


GGCTCTCAAA AAGCCACAAC AGAAATCAAA GGCAATGTTA CCATCAATAA AAACACTAAC 2580
GCTACTCTTT GTGGTGCGAA TTTTGCCGAA AACAAATCGC CTTTAAATAT AGCAGGAAAT 2640
GTTATTAATA ATGGCAACCT TACCACTGCC GGCTCCATTA TCAATATAGC CGGAAATCTT 2700
ACTGTTTCAA AAGGCGCTAA CCTTCAAGCT ATAACAAATT ACACTTTTAA TGTAGCCGGC 2760
TCATTTGACA ACAATGGCGC TTCAAACATT TCCATTGCCA GAGGAGGGGC TAAATTTAAA 2820
GATATCAATA ACACCAGTAG CTTAAATATT ACCACCAACT CTGATACCAC TTACCGCACC 2880
ATTATAAAAG GCAATATATC CAACAAATCA GGTGATTTGA ATATTATTGA TAAAAAAAGC 2940
GACGCTGAAA TCCAAATTGG CGGCAATATC TCACAAAAAG AAGGCAATCT CACAATTTCT 3000
TCTGATAAAG TAAATATTAC CAATCAGATA ACAATCAAAG CAGGCGTTGA AGGGGGGCGT 3060
TCTGATTCAA GTGAGGCAGA AAATGCTAAC CTAACTATTC AAACCAAAGA GTTAAAATTG 3120
GCAGGAGACC TAAATATTTC AGGCTTTAAT AAAGCAGAAA TTACAGCTAA AAATGGCAGT 3180
GATTTAACTA TTGGCAATGC TAGCGGTGGT AATGCTGATG CTAAAAAAGT GACTTTTGAC 3240
AAGGTTAAAG ATTCAAAAAT CTCGACTGAC GGTCACAATG TAACACTAAA TAGCGAAGTG 3300
AAAACGTCTA ATGGTAGTAG CAATGCTGGT AATGATAACA GCACCGGTTT AACCATTTCC 3360
GCAAAAGATG TAACGGTAAA CAATAACGTT ACCTCCCACA AGACAATAAA TATCTCTGCC 3420
GCAGCAGGAA ATGTAACAAC CAAAGAAGGC ACAACTATCA ATGCAACCAC AGGCAGCGTG 3480
GAAGTAACTG CTCAAAATGG TACAATTAAA GGCAACATTA CCTCGCAAAA TGTAACAGTG 3540
ACAGCAACAG AAAATCTTGT TACCACAGAG AATGCTGTCA TTAATGCAAC CAGCGGCACA 3600
GTAAACATTA GTACAAAAAC AGGGGATATT AAAGGTGGAA TTGAATCAAC TTCCGGTAAT 3660
GTAAATATTA CAGCGAGCGG CAATACACTT AAGGTAAGTA ATATCACTGG TCAAGATGTA 3720
ACAGTAACAG CGGATGCAGG AGCCTTGACA ACTACAGCAG GCTCAACCAT TAGTGCGACA 3780
ACAGGCAATG CAAATATTAC AACCAAAACA GGTGATATGA ACGGTAAAGT TGAATCCAGC 3840
TCCGGCTCTG TAACACTTGT TGCAACTGGA GCAACTCTTG CTGTAGGTAA TATTTCAGGT 3900
AACACTGTTA CTATTACTGC GGATAGCGGT AAATTAACCT CCACAGTAGG TTCTACAATT 3960
AATGGGACTA ATAGTGTAAC CACCTCAAGC CAATCAGGCG ATATTGAAGG TACAATTTCT 4020
GGTAATACAG TAAATGTTAC AGCAAGCACT GGTGATTTAA CTATTGGAAA TAGTGCAAAA 4080
GTTGAAGCGA AAAATGGAGC TGCAACCTTA ACTGCTGAAT CAGGCAAATT AACCACCCAA 4140
ACAGGCTCTA GCATTACCTC AAGCAATGGT CAGACAACTC TTACAGCCAA GGATAGCAGT 4200
ATCGCAGGAA ACATTAATGC TGCTAATGTG ACGTTAAATA CCACAGGCAC TTTAACTACT 4260
ACAGGGGATT CAAAGATTAA CGCAACCAGT GGTACCTTAA CAATCAATGC AAAAGATGCC 4320
AAATTAGATG GTGCTGCATC AGGTGACCGC ACAGTAGTAA ATGCAACTAA CGCAAGTGGC 4380
TCTGGTAACG TGACTGCGAA AACCTCAAGC AGCGTGAATA TCACCGGGGA TTTAAACACA 4440
ATAAATGGGT TAAATATCAT TTCGGAAAAT GGTAGAAACA CTGTGCGCTT AAGAGGCAAG 4500
GAAATTGATG TGAAATATAT CCAACCAGGT GTAGCAAGCG TAGAAGAGGT AATTGAAGCG 4560
AAACGCGTCC TTGAGAAGGT AAAAGATTTA TCTGATGAAG AAAGAGAAAC ACTAGCCAAA 4620
CTTGGTGTAA GTGCTGTACG TTTCGTTGAG CCAAATAATG CCATTACGGT TAATACACAA 4680
AACGAGTTTA CAACCAAACC ATCAAGTCAA GTGACAATTT CTGAAGGTAA GGCGTGTTTC 4740
TCAAGTGGTA ATGGCGCACG AGTATGTACC AATGTTGCTG ACGATGGACA GCAG 4794

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4803 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

ATGAACAAGA TATATCGTCT CAAATTCAGC AAACG CCTG A ATGCTTTGGT TGCTGTGTCT 6 0 

GAATTGACAC GGGGTTGTGA CCATTCCACA GAAAAAGGCA GTGAAAAACC TGTTCGTACG 12 0 

AAAGTACGCC ACTTGG CGTT AAAGCCACTT TCCGCTATAT TGCTATCTTT GGGCATGGCA 18 0 

TCCATTCCGC AATCTGTTTT AG CGAGCGGT TTACAGGGAA TGAGCGTCGT ACACGGTACA 24 0 

GCAACCATGC AAGTAGACGG CAATAAAACC ACTATCCGTA ATAGCGTCAA TGCTATCATC 3 00 

AATTGGAAAC AATTTAACAT TGACCAAAAT GAAATGGTGC AGTTTTTACA AGAAAGCAGC 360 

AACTCTGCCG TTTTCAACCG TGTTACATCT GACCAAATCT CCCAATTAAA AGGGATTTTA 42 0 

GATTCTAACG GACAAGTCTT TTTAATCAAC CCAAATGGTA TCACAATAGG TAAAGACGCA 48 0 

ATTATTAACA CTAATGGCTT TACTGCTTCT ACGCTAGACA TTTCTAACGA AAACATCAAG _ 54 0 

GCGCGTAATT TCACCCTTGA GCAAACCAAG GATAAAGCAC TCGCTGAAAT CGTGAATCAC 600 

GGTTTAATTA CCGTTGGTAA AGACGGTAGC GTAAACCTTA TTGGTGGCAA AGTGAAAAAC 660 

GAGGGCGTGA TTAGCGTAAA TGGCGGTAGT ATTTCTTTAC TTGCAGGGCA AAAAATCACC 72 0 

ATCAGCGATA TAATAAATCC AACCATCACT TACAGCATTG CTGCACCTGA AAACGAAGCG 780 

ATCAATCTGG GCGATATTTT TGCCAAAGGT GGTAACATTA ATGTCCGCGC TGCCACTATT 840 

CGCAATAAAG GTAAACTTTC TGCCGACTCT GTAAGCAAAG ATAAAAGTGG TAACATTGTT 900 

CTCTCTGCCA AAGAAGGTGA AGCGGAAATT GGCGGTGTAA TTTCCGCTCA AAATCAGCAA 960 

GCCAAAGGTG GTAAGTTGAT GATTACAGGT GATAAAGTCA CATTAAAAAC AGGTGCAGTT 1020 

ATCGACCTTT CAGGTAAAGA AGGGGGAGAG ACTTATCTTG GCGGTGATGA GCGTGGCGAA 1080 

GGTAAAAATG GTATTCAATT AGCGAAGAAA ACCTCTTTAG AAAAAGGCTC GACAATTAAT 1140 

GTATCAGGCA AAGAAAAAGG CGGGCGCGCT ATTGTATGGG GCGATATTGC ATTAATTAAT 1200 

GGTAACATTA ATGCTCAAGG TAGCGATATT GCTAAAACTG GCGGCTTTGT GGAAACATCA 1260



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GGACATGACT TATCCATTGG TGATGATGTG ATTGTTGACG CTAAAGAGTG GTTATTAGAC 
CCAGATGATG TGTCCATTGA AACTCTTACA TCTGGACGCA ATAATACCGG CGAAAACCAA 
GGATATACAA CAGGAGATGG GACTAAAGAG TCACCTAAAG GTAATAGTAT TTCTAAACCT
ACATTAACAA ACTCAACTCT TGAGCAAATC CTAAGAAGAG GTTCTTATGT TAATATCACT 
GCTAATAATA GAATTTATGT TAATAGCTCC ATCAACTTAT CTAATGGCAG TTTAACACTT 
CACACTAAAC GAGATGGAGT TAAAATTAAC GGTGATATTA CCTCAAACGA AAATGGTAAT 
TTAACCATTA AAGCAGGCTC TTGGGTTGAT GTTCATAAAA ACATCACGCT TGGTACGGGT
TTTTTGAATA TTGTCGCTGG GGATTCTGTA GCTTTTGAGA GAGAGGGCGA TAAAGCACGT 
AACGCAACAG ATGCTCAAAT TACCGCACAA GGGACGATAA CCGTCAATAA AGATGATAAA
CAATTTAGAT TCAATAATGT ATCTATTAAC GGGACGGGCA AGGGTTTAAA GTTTATTGCA 
AATCAAAATA ATTTCACTCA TAAATTTGAT GGCGAAATTA ACATATCTGG AATAGTAACA
ATTAACCAAA CCACGAAAAA AGATGTTAAA TACTGGAATG CATCAAAAGA CTCTTACTGG 
AATGTTTCTT CTCTTACTTT GAATACGGTG CAAAAATTTA CCTTTATAAA ATTCGTTGAT 
AGCGGCTCAA ATTCCCAAGA TTTGAGGTCA TCACGTAGAA GTTTTGCAGG CGTACATTTT 
AACGGCATCG GAGGCAAAAC AAACTTCAAC ATCGGAGCTA ACGCAAAAGC CTTATTTAAA
TTAAAACCAA ACGCCGCTAC AGACCCAAAA AAAGAATTAC CTATTACTTT TAACGCCAAC
ATTACAGCTA CCGGTAACAG TGATAGCTCT GTGATGTTTG ACATACACGC CAATCTTACC 
TCTAGAGCTG CCGGCATAAA CATGGATTCA ATTAACATTA CCGGCGGGCT TGACTTTTCC
ATAACATCCC ATAATCGCAA TAGTAATGCT TTTGAAATCA AAAAAGACTT AACTATAAAT 
GCAACTGGGT CGAATTTTAG TCTTAAGCAA ACGAAAGATT CTTTTTATAA TGAATACAGC 
AAACACGCCA TTAACTCAAG TCATAATCTA ACCATTCTTG GCGGCAATGT CACTCTAGGT 
GGGGAAAATT CAAGCAGTAG CATTACGGGC AATATCAATA TCACCAATAA AGCAAATGTT
ACATTACAAG CTGACACCAG CAACAGCAAC ACAGGCTTGA AGAAAAGAAC TCTAACTCTT 
GGCAATATAT CTGTTGAGGG GAATTTAAGC CTAACTGGTG CAAATGCAAA CATTGTCGGC 
AATCTTTCTA TTGCAGAAGA TTCCACATTT AAAGGAGAAG CCAGTGACAA CCTAAACATC
ACCGGCACCT TTACCAACAA CGGTACCGCC AACATTAATA TAAAACAAGG AGTGGTAAAA 
CTCCAAGGCG ATATTATCAA TAAAGGTGGT TTAAATATCA CTACTAACGC CTCAGGCACT 
CAAAAAACCA TTATTAACGG AAATATAACT AACGAAAAAG GCGACTTAAA CATCAAGAAT 
ATTAAAGCCG ACGCCGAAAT CCAAATTGGC GGCAATATCT CACAAAAAGA AGGCAATCTC 
ACAATTTCTT CTGATAAAGT AAATATTACC AATCAGATAA CAATCAAAGC AGGCGTTGAA 
GGGGGGCGTT CTGATTCAAG TGAGGCAGAA AATGCTAACC TAACTATTCA AACCAAAGAG 
TTAAAATTGG CAGGAGACCT AAATATTTCA GGCTTTAATA AAGCAGAAAT TACAGCTAAA 
AATGGCAGTG ATTTAACTAT TGGCAATGCT AGCGGTGGTA ATGCTGATGC TAAAAAAGTG 
ACTTTTGACA AGGTTAAAGA TTCAAAAATC TCGACTGACG GTCACAATGT AACACTAAAT 
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1860 
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2580 
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2820 
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AGCGAAGTGA AAACGTCTAA TGGTAGTAGC AATGCTGGTA ATGATAACAG CACCGGTTTA 3360
ACCATTTCCG CAAAAGATGT AACGGTAAAC AATA*;GTTA CCTCCCACAA GACAATAAAT 3420
ATCTCTGCCG CAGCAGGAAA TGTAACAACC AAAGAAGGOA CAACTATCAA TGCAACCACA 3480
GGCAGCGTGG AAGTAACTGC TCAAAATGGT ACAA\L TAAAG GCAACATTAC CTCGCAAAAT 3540
GTAACAGTGA CAGCAACAGA AAATCTTGTT ACCACAGAGA ATGCTGTCAT TAATGCAACC 3600
AGCGGCACAG TAAACATTAG TACAAAAACA GGGGATATTA AAGGTGGAAT TGAATCAACT 3660
TCCGGTAATG TAAATATTAC AGCGAGCGGC AATACACTTA AGGTAAGTAA TATCACTGGT 3720
CAAGATGTAA CAGTAACAGC GGATGCAGGA GCCTTGACAA CTACAGCAGG CTCAACCATT 3780
AGTGCGACAA CAGGCAATGC AAATATTACA ACCAAAACAG GTGATATCAA CGGTAAAGTT 3840
GAATCCAGCT CCGGCTCTGT AACACTTGTT GCAACTGGAG CAACTCTTGC TGTAGGTAAT 3900
ATTTCAGGTA ACACTGTTAC TATTACTGCG GATAGCGGTA AATTAACCTC CACAGTAGGT 3960
TCTACAATTA ATGGGACTAA TAGTGTAACC ACCTCAAGCC AATCAGGCGA TATTGAAGGT 4020
ACAATTTCTG GTAATACAGT AAATGTTACA GCAAGCACTG GTGATTTAAC TATTGGAAAT 4080
AGTGCAAAAG TTGAAGCGAA AAATGGAGCT GCAACCTTAA CTGCTGAATC AGGCAAATTA 4140
ACCACCCAAA CAGGCTCTAG CATTACCTCA AGCAATGGTC AGACAACTCT TACAGCCAAG 4200
GATAGCAGTA TCGCAGGAAA CATTAATGCT GCTAATGTGA CGTTAAATAC CACAGGCACT 4260
TTAACTACTA CAGGGGATTC AAAGATTAAC GCAACCAGTG GTACCTTAAC AATCAATGCA 4320
AAAGATGCCA AATTAGATGG TGCTGCATCA GGTGACCGCA CAGTAGTAAA TGCAACTAAC 4380
GCAAGTGGCT CTGGTAACGT GACTGCGAAA ACCTCAAGCA GCGTGAATAT CACCGGGGAT 4440
TTAAACACAA TAAATGGGTT AAATATCATT TCGGAAAATG GTAGAAACAC TGTGCGCTTA 4500
AGAGGCAAGG AAATTGATGT GAAATATATC CAACCAGGTG TAGCAAGCGT AGAAGAGGTA 4560
ATTGAAGCGA AACGCGTCCT TGAGAAGGTA AAAGATTTAT CTGATGAAGA AAGAGAAACA 4620
CTAGCCAAAC TTGGTGTAAG TGCTGTACGT TTCGTTGAGC CAAATAATGC CATTACGGTT 4680
AATACACAAA ACGAGTTTAC AACCAAACCA TCAAGTCAAG TGACAATTTC TGAAGGTAAG 4740
GCGTGTTTCT CAAGTGGTAA TGGCGCACGA GTATGTACCA ATGTTGCTGA CGATGGACAG 4800
CAG
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(2) INFORMATION FOR SEQ ID NO: 9: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1599 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Thr Arg Gly Cys Asp His Ser Thr Glu Lys
20 25 30

Gly Ser Glu Lys Pro Val Arg Thr Lys Val Arg His Leu Ala Leu Lys
35 40 45

Pro Leu Ser Ala Ile Leu Leu Ser Leu Gly Met Ala Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Ser Val Val His Gly Thr
65 70 75 80

Ala Thr Met Gln Val Asp Gly Asn Lys Thr Thr Ile Arg Asn Ser Val
85 90 95

Asn Ala Ile Ile Asn Trp Lys Gln Phe Asn Ile Asp Gln Asn Glu Met
100 105 110

Glu Gln Phe Leu Gln Glu Ser Ser Asn Ser Ala Val Phe Asn Arg Val
115 120 125

Thr Ser Asp Gln Ile Ser Gln Leu Lys Gly Ile Leu Asp Ser Asn Gly
130 135 140

Gln Val Phe Leu Ile Asn Pro Asn Gly Ile Thr Ile Gly Lys Asp Ala
145 150 155 160

Ile Ile Asn Thr Asn Gly Phe Thr Ala Ser Thr Leu Asp Ile Ser Asn
165 170 175

Glu Asn Ile Lys Ala Arg Asn Phe Thr Leu Glu Gln Thr Lys Asp Lys
180 185 190

Ala Leu Ala Glu Ile Val Asn His Gly Leu Ile Thr Val Gly Lys Asp
195 200 205

Gly Ser Val Asn Leu Ile Gly Gly Lys Val Lys Asn Glu Gly Val Ile
210 215 220

Ser Val Asn Gly Gly Ser Ile Ser Leu Leu Ala Gly Gln Lys Ile Thr
225 230 235 240

Ile Ser Asp Ile Ile Asn Pro Thr Ile Thr Tyr Ser Ile Ala Ala Pro
245 250 255

Glu Asn Glu Ala Ile Asn Leu Gly Asp Ile Phe Ala Lys Gly Gly Asn
260 265 270

Ile Asn Val Arg Ala Ala Thr Ile Arg Asn Lys Gly Lys Leu Ser Ala
275 280 285

Asp Ser Val Ser Lys Asp Lys Ser Gly Asn Ile Val Leu Ser Ala Lys
290 295 300

Glu Gly Glu Ala Glu Ile Gly Gly Val Ile Ser Ala Gln Asn Gln Gln
305 310 315 320

Ala Lys Gly Gly Lys Leu Met Ile Thr Gly Asp Lys Val Thr Leu Lys
325 330 335
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Thr Gly Ala Val Ile Asp Leu Ser Gly Lys Glu Gly Gly Glu Thr Tyr
340 345 350

Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala
355 360 365

Lys Lys Thr Thr Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys
370 375 380

Glu Lys Gly Gly Arg Ala Ile Val Trp Gly Asp Ile Ala Leu Ile Asp
385 390 395 400

Gly Asn Ile Asn Ala Gln Gly Lys Asp Ile Ala Lys Thr Gly Gly Phe
405 410 415

Val Glu Thr Ser Gly His Tyr Leu Ser Ile Asp Asp Asn Ala Ile Val
420 425 430

Lys Thr Lys Glu Trp Leu Leu Asp Pro Glu Asn Val Thr Ile Glu Ala
435 440 445

Pro Ser Ala Ser Arg Val Glu Leu Gly Ala Asp Arg Asn Ser His Ser
450 455 460

Ala Glu Val Ile Lys Val Thr Leu Lys Lys Asn Asn Thr Ser Leu Thr
465 470 475 480

Thr Leu Thr Asn Thr Thr Ile Ser Asn Leu Leu Lys Ser Ala His Val
485 490 495

Val Asn Ile Thr Ala Arg Arg Lys Leu Thr Val Asn Ser Ser Ile Ser
500 505 510

Ile Glu Arg Gly Ser His Leu Ile Leu His Ser Glu Gly Gln Gly Gly
515 520 525

Gln Gly Val Gln Ile Asp Lys Asp Ile Thr Ser Glu Gly Gly Asn Leu
530 535 540

Thr Ile Tyr Ser Gly Gly Trp Val Asp Val His Lys Asn Ile Thr Leu
545 550 555 560

Gly Ser Gly Phe Leu Asn Ile Thr Thr Lys Glu Gly Asp Ile Ala Phe
565 570 575

Glu Asp Lys Ser Gly Arg Asn Asn Leu Thr Ile Thr Ala Gln Gly Thr
580 585 590

Ile Thr Ser Gly Asn Ser Asn Gly Phe Arg Phe Asn Asn Val Ser Leu
595 600 605

Asn Ser Leu Gly Gly Lys Leu Ser Phe Thr Asp Ser Arg Glu Asp Arg
610 615 620

Gly Arg Arg Thr Lys Gly Asn Ile Ser Asn Lys Phe Asp Gly Thr Leu
625 630 635 640

Asn Ile Ser Gly Thr Val Asp Ile Ser Met Lys Ala Pro Lys Val Ser
645 650 655

Trp Phe Tyr Arg Asp Lys Gly Arg Thr Tyr Trp Asn Val Thr Thr Leu
660 665 670

Asn Val Thr Ser Gly Ser Lys Phe Asn Leu Ser Ile Asp Ser Thr Gly
675 680 685
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Ser Gly Ser Thr Gly Pro Ser Ile Arg Asn Ala Glu Leu Asn Gly Ile
690 695 700

Thr Phe Asn Lys Ala Thr Phe Asn Ile Ala Gln Gly Ser Thr Ala Asn

715 720

Phe Ser Ile Lys Ala Ser Ile Met Pro Phe Lys Ser Asn Ala Asn Tyr
725 730 735

Ala Leu Phe Asn Glu Asp Ile Ser Val Ser Gly Gly Gly Ser Val Asn
740 745 750

Phe Lys Leu Asn Ala Ser Ser Ser Asn Ile Gln Thr Pro Gly Val Ile

760 765

Ile Lys Ser Gln Asn Phe Asn Val Ser Gly Gly Ser Thr Leu Asn Leu
770 775 780

Lys Ala Glu Gly Ser Thr Glu Thr Ala Phe Ser Ile Glu Asn Asp Leu

795 800
Asn Leu Asn Ala Thr Gly Gly Asn Ile Thr Ile Arg Gln Val Glu Gly
805 810 815

Thr Asp Ser Arg Val Asn Lys Gly Val Ala Ala Lys Lys Asn Ile Thr
820 825 830

Phe Lys Gly Gly Asn Ile Thr Phe Gly Ser Gln Lys Ala Thr Thr Glu

Ile Lys Gly Asn Val Thr Ile Asn Lys Asn Thr Asn Ala Thr Leu Arg

Gly Ala Asn Phe Ala Glu Asn Lys Ser Pro Leu Asn Ile Ala Gly Asn
865 870 875 880

Val Ile Asn Asn Gly Asn Leu Thr Thr Ala Gly Ser Ile Ile Asn Ile
885 890 895

Ala Gly Asn Leu Thr Val Ser Lys Gly Ala Asn Leu Gln Ala Ile Thr
900 905 910

Asn Tyr Thr Phe Asn Val Ala Gly Ser Phe Asp Asn Asn Gly Ala Ser
915 920 925

Asn Ile Ser Ile Ala Arg Gly Gly Ala Lys Phe Lys Asp Ile Asn Asn
930 935 940

Thr Ser Ser Leu Asn Ile Thr Thr Asn Ser Asp Thr Thr Tyr Arg Thr
945 950 955 960

Ile Ile Lys Gly Asn Ile Ser Asn Lys Ser Gly Asp Leu Asn Ile Ile
965 970 975

Asp Lys Lys Ser Asp Ala Glu Ile Gln Ile Gly Gly Asn Ile Ser Gln
980 985 990

Lys Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Val Asn Ile Thr Asn
995 1000 1005

Gln J™ n Thr 116 LyS Ala Gly Val Glu G1 y ^9 Ser Asp Ser Ser 

1010 ioiS 1020 

Glu Ala Glu Asn Ala Asn Leu Thr Ile Gln Thr Lys Glu Leu Lys Leu
1025 1030 1035 1040
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Ala ciy Asp Leu As^Ile Sar Gly Phe Asn Lys Ala Glu n- Thr ^ 

1050 1055 • 

Lys Asn Gly Ser Asp Leu Thr Ile Gly Asn Ala Ser Gly Gly Asn Ala

1065 1070

Asp Ala Lys Lys Val Thr Phe Asp Lys Val Lys Asp Ser Lys ^ ^

1080 1085

Thr Asp Gly His Asn Val Thr Leu Asn Ser Glu Val Lys Thr Ser Asn 

1095 1100 
Gly Ser Ser Asn Ala Gly Asn Asp Asn Ser Thr Gly Leu Thr Ile Ser

1110 1115 1120

Ala Lys Asp Val Thr Val Asn Asn Asn Val Thr Ser His Lys Thr Ile

1130 1135
Asn Ile Ser Ala Ala Ala Gly Asn Val Thr Thr Lys Glu Gly Thr Thr

1145 1150 

Ile Asn Ala Thr Thr Gly Ser Val Glu Val Thr Ala Gln Asn Gly Thr

1160 1165

Ile Lys Gly Asn Ile Thr Ser Gln Asn Val Thr Val Thr Ala Thr Glu

1175 1180 

nes LSU T ^ ?^ 0 ASn Ala Val Ile *•» ^la Thr Ser Gly Thr 

1195 1200 
Val Asn Ile Ser Thr Lys Thr Gly Asp Ile Lys Gly Gly Ile Glu Ser

1210 1215
Thr Ser Gly Asn Val Asn Ile Thr Ala Ser Gly Asn Thr Leu Lys Val

1225 1230
Ser Asn Ile Thr Gly Gln Asp Val Thr Val Thr Ala Asp Ala Gly Ala

Leu Thr Thr Thr Ala Gly Ser Thr Ile Ser Ala Thr Thr Gly Asn Ala

1255 1260 



S Ile ^ ^ ^ £ 70 Gly AS P Ile Asn G ^ Val Glu ser Ser 

1275 1280

Ser Gly Ser Val Thr Leu Val Ala Thr Gly Ala Thr Leu Ala Val Gly 
1285 1290 1295



Asn Ile Ser Gly Asn Thr Val Thr Ile Thr Ala Asp Ser Gly Lys Leu

Thr Ser Thr Val Gly Ser Thr Ile Asn Gly Thr Asn Ser Val Thr Thr

1320 1325
Ser Ser Gln Ser Gly Asp Ile Glu Gly Thr Ile Ser Gly Asn Thr Val



1340 



Asn Val Thr Ala Ser Thr Gly Asp Leu Thr Ile Gly Asn Ser Ala Lys

1355 1360 
Val Glu Ala Lys Asn Gly Ala Ala Thr Leu Thr Ala Glu Ser Gly Lys 
1365 1370 1375

Leu Thr Thr Gln Thr Gly Ser Ser Ile Thr Ser Ser Asn Gly Gln Thr

1385 1390
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Thr Leu Thr Ala Lys Asp Ser Ser Ile Ala Gly Asn Ile Asn Ala Ala

1400 1405 

Asn Val Thr Leu Asn Thr Thr Gly Thr Leu Thr Thr Thr Gly Asp Ser

1415 1420

Lys Ile Asn Ala Thr Ser Gly Thr Leu Thr Ile Asn Ala Lys Asp Ala



1435 1440
Lys Leu Asp Gly Ala Ala Ser Gly Asp Arg Thr Val Val Asn Ala Thr

1445 1450 1455

Asn Ala Ser Gly Ser Gly Asn Val Thr Ala Lys Thr Ser Ser Ser Val 

1465 1470 

Asn Ile Thr Gly Asp Leu Asn Thr Ile Asn Gly Leu Asn Ile Ile Ser

1480 1485

Glu Asn Gly Arg Asn Thr Val Arg Leu Arg Gly Lys Glu Ile Asp Val

1495 1500

Lys Tyr Ile Gln Pro Gly Val Ala Ser Val Glu Glu Val Ile Glu Ala
Lys Arg Val Leu Glu Lys Val Lys Asp Leu Ser Asp Glu Glu Arg Glu



1535 



Thr Leu Ala Lys Leu Gly Val Ser Ala Val Arg Phe Val Glu Pro Asn

1545 1550 

Asn Ala Ile Thr Val Asn Thr Gln Asn Glu Phe Thr Thr Lys Pro Ser

1560 1565

Ser Gln Val Thr Ile Ser Glu Gly Lys Ala Cys Phe Ser Ser Gly Asn

1575 1580

Gly Ala Arg Val Cys Thr Asn Val Ala Asp Asp Gly Gln Gln Pro

1590 1595 

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1600 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu



10 15 



Val Ala Val Ser Glu Leu Thr Arg Gly Cys Asp His Ser Thr Glu Lys 

Gly Ser Glu Lys Pro Val Arg Thr Lys Val Arg His Leu Ala Leu Lys 

40 45 2 

Pro Leu Ser Ala Ile Leu Leu Ser Leu Gly Met Ala Ser Ile Pro Gln

55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Ser Val Val His Gly Thr

75 80 
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». - H« 01„ „ ?1 Asp Gly As „ ^ t ^ _ ^ ^ 

*» «. «• n. to Trp Lys 8ln PJ| As „ lle ssp - ^ 

Oln Ph. ,. u G ,, 01u s „ ? „ Asn ^ r m> v ^ ^ ^ ^ ^ 

- S j «. u. S „ g; teu Lys Gly lle ^ ser Asn ciy 

^ iS *" »J «, ^ Ala 

xi. to Tte Gly phe Thr Ma ^ ^ ^ - 

«u ta XX. jjj „. srg A „ phe l ^ nu ^ ^ ^ 

* 190 

Ala Leu Ala Glu lie 1 a«« u - 

Val Asn Hxs Gly Leu lie Thr Val Gly Lys Asp 

205 

Gly Ser Val Asn Leu H*» rh, n 

He Gly Gly Lys Val Lys ^ ^ 

220 

Ser Val Asn Gly Glv Ser- rio o 

^ 2I0 ^ LeU ^y Gin Lys Ile Thr 

He Ser Asp He He Asn Pro Thr He Thr- t, o 

24 s Tnr He Thr Tyr Ser He Ala Ala Pro 

Glu Asn Glu Ala He Asn Leu Gly Asn He ph s1 

260 Xy As f Ile Phe Ala Lys Gly Gly Asn 

xx. «. v., Ar3 u , A1 , Thr ^ Arg ^ ^ ^ ^ ^ 

S.r V.X s« Lys Asp s „ Gly Asn ^ ™ s ^ ^ ^ 

Glu Gly Glu Ala Glu He Glv <-i„ i, T , 

305 310 ^ Val Ile Hi A ^ Gin Asn Gin Gin 

Al. X*. Gly Gly Lys Leu Me t He Thr Gly Asp Lys Val Thr Leu Z 

Thr Gly Ala Val He Asp Leu Ser Gly Lys Glu Gly Gly Glu 2 ^ 

350 

" ° ly ?» G - «y «; iv L ys Asn „ y n. Gln Leu Ala 

^ Ly S Thr Thr Leu Glu Ly f Gly Ser Thr ^ ^ ^ ^ ^ ^ 
Glu Lys Gly Gly Arg Ala Ile Va 

38S 39Q <TP <*xy Asp He Ala Leu H e Asp 

Gl y Asn He Asn Ala Gin Glv a t , & ° 

405 ° ly SSr *«* Hi Al. Lys Thr Gly jly Phe 

Val Glu Thr Ser Gly His Asp Leu Ser He Glv » » 

420 42 S y P ^ P Val Ile 



430 
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Asp Ala Lys Glu Trp Leu Leu Asp Pro Asp Asp Val Ser Ile Glu Thr

440 445 

Leu Thr Ser Gly Arg Asn Asn Thr Gly Glu Asn Gln Gly Tyr Thr Thr

455 460

Gly Asp Gly Thr Lys Glu Ser Pro Lys Gly Asn Ser Ile Ser Lys Pro

470 475 480

Thr Leu Thr Asn Ser Thr Leu Glu Gln Ile Leu Arg Arg Gly Ser Tyr
485 490 495

Val Asn Ile Thr Ala Asn Asn Arg Ile Tyr Val Asn Ser Ser Ile Asn

Leu Ser Asn Gly Ser Leu Thr Leu His Thr Lys Arg Asp Gly Val Lys
515 520 525

Ile Asn Gly Asp Ile Thr Ser Asn Glu Asn Gly Asn Leu Thr Ile Lys

535 540



Ala Gly Ser Trp Val Asp Val His Lys Asn Ile Thr Leu Gly Thr Gly

550 555 560

Phe Leu Asn Ile Val Ala Gly Asp Ser Val Ala Phe Glu Arg Glu Gly
565 570 575

Asp Lys Ala Arg Asn Ala Thr Asp Ala Gln Ile Thr Ala Gln Gly Thr
580 585 590

Ile Thr Val Asn Lys Asp Asp Lys Gln Phe Arg Phe Asn Asn Val Ser
595 600 605

Leu Asn Gly Thr Gly Lys Gly Leu Lys Phe Ile Ala Asn Gln Asn Asn
610 615 620

Phe Thr His Lys Phe Asp Gly Glu Ile Asn Ile Ser Gly Ile Val Thr
625 630 635 640

Ile Asn Gln Thr Thr Lys Lys Asp Val Lys Tyr Trp Asn Ala Ser Lys
645 650 655

Asp Ser Tyr Trp Asn Val Ser Ser Leu Thr Leu Asn Thr Val Gln Lys
660 665 670

Phe Thr Phe Ile Lys Phe Val Asp Ser Gly Ser Asn Gly Gln Asp Leu
675 680 685

^ f t « SSr Arg Ser Phe Ala G1 y Val Hi s Phe Asn Gly He Gly 

690 695 700 

Gly Lys Thr Asn Phe Asn Ile Gly Ala Asn Ala Lys Ala Leu Phe Lys
705 710 715 720

Leu Lys Pro Asn Ala Ala Thr Asp Pro Lys Lys Glu Leu Pro Ile Thr
725 730 735

Phe Asn Ala Asn Ile Thr Ala Thr Gly Asn Ser Asp Ser Ser Val Met
740 745 750

Phe Asp Ile His Ala Asn Leu Thr Ser Arg Ala Ala Gly Ile Asn Met
755 760 765

Asp Ser Ile Asn Ile Thr Gly Gly Leu Asp Phe Ser Ile Thr Ser His
770 775 780
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Asn Arg Asn Ser Asn Ala Phe Glu Ile Lys Lys Asp Leu Thr Ile Asn

Ala Thr Gly Ser Asn Phe Ser Leu Lys Gln Thr Lys Asp Ser ^ ^

Asn Glu Tyr Ser Lys His Ala Ile Asn Ser Ser His Asn Leu Thr Ile

Leu Gly Gly Asn Val Thr Leu Gly Gly Glu Asn Ser Ser Ser Ser Ile

840 845
Thr Gly Asn Ile Asn Ile Thr Asn Lys Ala Asn Val Thr Leu Gln Ala

855 860 

86 5 Thr G1 V ^ Lys Arg Thr Leu Thr Leu 

Gly Asn Ile Ser Val Glu Gly Asn Leu Ser Leu Thr Gly Ala Asn Ala

890 895

Asn Ile Val Gly Asn Leu Ser Ile Ala Glu Asp Ser Thr Phe Lys Gly

905 910

Glu Ala Ser Asp Asn Leu Asn Ile Thr Gly Thr Phe Thr Asn Asn Gly

920 925

Thr Ala Asn Ile Asn Ile Lys Gly Val Val Lys Leu Gly Asp Ile Asn

935 940
Asn Lys Gly Gly Leu Asn Ile Thr Thr Asn Ala Ser Gly Thr Gln Lys

955 960
Thr Ile Ile Asn Gly Asn Ile Thr Asn Glu Lys Gly Asp Leu Asn Ile

970 975

Lys Asn Ile Lys Ala Asp Ala Glu Ile Gln Ile Gly Gly Asn Ile Ser

985 990

Gln Lys Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Val Asn Ile Thr

1000 1005

Asn Gln Ile Thr Ile Lys Ala Gly Val Glu Gly Gly Arg Ser Asp Ser

1015 1020

Ser Glu Ala Glu Asn Ala Asn Leu Thr Ile Gln Thr Lys Glu Leu Lys

1035 1040
Leu Ala Gly Asp Leu Asn Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr

1045 1050 1055

Ala Lys Asn Gly Ser Asp Leu Thr Ile Gly Asn Ala Ser Gly Gly Asn

1065 1070

Ala Asp Ala Lys Lys Val Thr Phe Asp Lys Val Lys Asp Ser Lys Ile

1080 1085

Iof 0 ASP Y£ s Tbr ^ S <~ Glu Val Lys Thr Ser 

1095 1100

Asn Gly Ser Ser Asn Ala Gly Asn Asp Asn Ser Thr Gly Leu Thr Ile

1115 1120
Ser Ala Lys Asp Val Thr Val Asn Asn Asn Val Thr Ser His Lys Thr 

1130 1135 
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II. Asn Ile s« A1 . Ala Ala Gly ^ v-1 G ^ Gly Thr 

Ss^ T ^ ^ f«val Glu Val Thr ; i. s Gln 0 A,„ Gly 

Thr Ile Lys Gly Asn Ile Thr Ser Gln Asn Val Thr Val Thr Ala Thr

1175 1180
Glu Asn Leu Val Thr Thr Glu Asn Ala v^l n a 

1185 119Q U ^ Sn Ai ~ Val He Asn Ala Thr Ser Gly 

1195 120Q 



Thr val Asn Ile se^Thr Lys Thr Gly l ^ i 

1210 1215 

Ser Thr ser ox LAs „ val Asn Ile s ^ ala ny ^ 

v.l Ser ^xx. Thr Gly Q1 „ „^ V>1 Thr ^ „^^" Ala oly 

Al. guar Thr Thr „. Thr Ue a ^ 

1260 

AlaAsn Xle Thr Thr Ly^Thr Gly Asp Ile Asn Qly Lyg ^ ^ ^ 

1275 128C 

Ser Ser Gly Ser Val Thr Leu Val Ala Thr- n, m ^ 

1285 a Toon 7 Thr LeU Ala V *l 

x^yo 1295 

G ly As „ He s.rGly As „ Thr val Thr Ma ^ 

x-eu Thr Se^Thr v.l oly Ser Asn „ y Thr Jjj^Tvm Thr 

Thr s«s„ am ser al y jjp ». olu Gly Thr Ile ^" ftsn 

- LJ>J: > 1340 
V.l s A.n Val Thr Ala SerThr Gly Asp Leu Thr^Xle Gly Asn Ser Ala 

Lys Val Glu Ala Ly^Asn Gly Ma Ala Thr Leu ^ ^ ^ ^ 

1370 1375 

Lys ,ee Thr Throln Thr Qly ser Sejn, Thr Ser Ser ^y „ 

Thr Thr Le^Thr A ie Lys ^ 3 ^ „, ^ ^ ™ 

1400 1405 
Ala Asn Val Thr Leu Asn Thr Thr Gly Thr Leu Thr Thr Thr Gly

1415 1420



Ser Lys Ile Asn Ala Thr Ser Gly Thr Leu Thr Ile Asn Ala Lys Asp

1435 1440
Ala Lys Leu Asp Gly Ala Ala Ser Gly Asp Arg Thr Val Val Asn Ala



1455 



Thr Asn Ala Ser Gly Ser Gly Asn Val Thr Ala Lys Thr Ser Ser Ser



1465 1470 



Val Asn Ile Thr Gly Asp Leu Asn Thr Ile Asn Gly Leu Asn Ile Ile

1480 1485 



